1 Introduction

[Needs more text]

In this document we present a work-flow for integration across different omics datasets.

[Note] This is not the final version of the document.

2 Packages and files

A package with regularized CCA and multiomics DIABLO method is mixOmics. Package igraph is needed for network analysis.

library(mixOmics)
library(igraph)

Package for complex heatmaps.

#BiocManager::install("ComplexHeatmap")
library(ComplexHeatmap)
# https://www.rdocumentation.org/packages/pheatmap/versions/1.0.12/topics/pheatmap
library(pheatmap)

2.1 Functions

Some original and adapted functions can be found in the file that is silently processed here.

%% Additional functions

out <- ""
out <- paste(out,knit_child("005-Functions.Rmd", quiet=TRUE))

2.2 File names

Usually we use a file management system under the pISA-tree framework. For simplicity, all files ( scripts, data ,… ) are in one directory.

Control of file source

usetest <- !TRUE

Sample description file (aka phenodata)

pfn <- "./input/phenodata_subset_2023-03-08.txt"
files <- c(
   "./input/data_hormonomics.txt"
 , "./input/data_metabolomics.txt"
 , "./input/data_qPCR.txt"
# , "./input/data_Phenomics.txt"
  , "./input/data_Proteomics.txt"
 )

Phenodata file name:

pfn
## [1] "./input/phenodata_subset_2023-03-08.txt"

Data file names

files 
## [1] "./input/data_hormonomics.txt"  "./input/data_metabolomics.txt"
## [3] "./input/data_qPCR.txt"         "./input/data_Proteomics.txt"

For future use and labeling, we need text names of dataset objects.

datanames <- c(
      "hormonomics"
    , "metabolomics"
    , "qPCR"
#    , "phenomics"
     , "proteomics"
)
datanames
## [1] "hormonomics"  "metabolomics" "qPCR"         "proteomics"

Define groups to consider. In a small example, we will pick two groups, based on treatment:

useTreatment <- c("C","H")

It is advisable to first read the phenodata, followed by actual data input. This enables smart selection of samples, based on the sample selection column with the assay name.

2.3 Phenodata

pfn
## [1] "./input/phenodata_subset_2023-03-08.txt"
#
phdata <- read.table(pfn
  , header = TRUE
  ,  sep = "\t"
  , stringsAsFactors = FALSE
  , row.names=1
  )
dim(phdata)
## [1] 56 30
names(phdata)
##  [1] "Treatment"                        
##  [2] "Harvest"                          
##  [3] "SamplingDay"                      
##  [4] "DaysOfStressH"                    
##  [5] "DaysOfStressD"                    
##  [6] "DaysOfStressW"                    
##  [7] "DaysRecovery"                     
##  [8] "Replicate"                        
##  [9] "Sample.type"                      
## [10] "Date"                             
## [11] "Heat.Drought.Water.Recovery.Days" 
## [12] "TreatmentxDatexPlant"             
## [13] "TreatmentxSamplingDay"            
## [14] "TreatmentxSamplingDayxReplicateNo"
## [15] "Comment"                          
## [16] "Num_Tubers"                       
## [17] "Total_Tubers_Weight"              
## [18] "FW_SH_Total"                      
## [19] "DW_SH_Total"                      
## [20] "Fv.Fm"                            
## [21] "qL"                               
## [22] "deltaT"                           
## [23] "TOP.AREA"                         
## [24] "COMPACTNESS"                      
## [25] "WATER.CONSUMPTION"                
## [26] "Proteomics"                       
## [27] "Transcriptomics"                  
## [28] "Metabolomics"                     
## [29] "Hormonomics"                      
## [30] "X_A_300_OverView.R"
pdata <- phdata

Filter out groups

pdata$Treatment
##  [1] "C"  "C"  "C"  "C"  "C"  "C"  "C"  "C"  "C"  "C"  "C"  "C"  "C" 
## [14] "C"  "C"  "C"  "D"  "D"  "D"  "D"  "D"  "D"  "D"  "D"  "H"  "H" 
## [27] "H"  "H"  "H"  "H"  "H"  "H"  "H"  "H"  "H"  "H"  "H"  "H"  "H" 
## [40] "H"  "HD" "HD" "HD" "HD" "HD" "HD" "HD" "HD" "W"  "W"  "W"  "W" 
## [53] "W"  "W"  "W"  "W"
pdata <- pdata[pdata$Treatment %in% useTreatment,]
pdata$Treatment
##  [1] "C" "C" "C" "C" "C" "C" "C" "C" "C" "C" "C" "C" "C" "C" "C" "C"
## [17] "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H" "H"
dim(pdata)
## [1] 32 30

Overview of selected samples:

addmargins(table(pdata$Treatment, pdata$SamplingDay))
##      
##        1  7  8 14 Sum
##   C    4  4  4  4  16
##   H    4  4  4  4  16
##   Sum  8  8  8  8  32
.treat <- unique(pdata$Treatment)
.days <- unique(pdata$SamplingDay)
.entry <- 0.5
summary(pdata)
##   Treatment            Harvest      SamplingDay   DaysOfStressH  
##  Length:32          Min.   :1.00   Min.   : 1.0   Min.   : 0.00  
##  Class :character   1st Qu.:1.75   1st Qu.: 5.5   1st Qu.: 0.00  
##  Mode  :character   Median :2.50   Median : 7.5   Median : 0.50  
##                     Mean   :2.50   Mean   : 7.5   Mean   : 3.75  
##                     3rd Qu.:3.25   3rd Qu.: 9.5   3rd Qu.: 7.25  
##                     Max.   :4.00   Max.   :14.0   Max.   :14.00  
##  DaysOfStressD DaysOfStressW  DaysRecovery   Replicate   
##  Min.   :0     Min.   :0     Min.   :0     Min.   :1.00  
##  1st Qu.:0     1st Qu.:0     1st Qu.:0     1st Qu.:1.75  
##  Median :0     Median :0     Median :0     Median :2.50  
##  Mean   :0     Mean   :0     Mean   :0     Mean   :2.50  
##  3rd Qu.:0     3rd Qu.:0     3rd Qu.:0     3rd Qu.:3.25  
##  Max.   :0     Max.   :0     Max.   :0     Max.   :4.00  
##  Sample.type            Date          
##  Length:32          Length:32         
##  Class :character   Class :character  
##  Mode  :character   Mode  :character  
##                                       
##                                       
##                                       
##  Heat.Drought.Water.Recovery.Days TreatmentxDatexPlant
##  Length:32                        Length:32           
##  Class :character                 Class :character    
##  Mode  :character                 Mode  :character    
##                                                       
##                                                       
##                                                       
##  TreatmentxSamplingDay TreatmentxSamplingDayxReplicateNo
##  Length:32             Length:32                        
##  Class :character      Class :character                 
##  Mode  :character      Mode  :character                 
##                                                         
##                                                         
##                                                         
##    Comment          Num_Tubers     Total_Tubers_Weight  FW_SH_Total
##  Length:32          Mode:logical   Mode:logical        Min.   :1   
##  Class :character   NA's:32        NA's:32             1st Qu.:1   
##  Mode  :character                                      Median :1   
##                                                        Mean   :1   
##                                                        3rd Qu.:1   
##                                                        Max.   :1   
##   DW_SH_Total     Fv.Fm         qL        deltaT     TOP.AREA
##  Min.   :1    Min.   :1   Min.   :1   Min.   :1   Min.   :1  
##  1st Qu.:1    1st Qu.:1   1st Qu.:1   1st Qu.:1   1st Qu.:1  
##  Median :1    Median :1   Median :1   Median :1   Median :1  
##  Mean   :1    Mean   :1   Mean   :1   Mean   :1   Mean   :1  
##  3rd Qu.:1    3rd Qu.:1   3rd Qu.:1   3rd Qu.:1   3rd Qu.:1  
##  Max.   :1    Max.   :1   Max.   :1   Max.   :1   Max.   :1  
##   COMPACTNESS WATER.CONSUMPTION   Proteomics Transcriptomics
##  Min.   :1    Min.   :1         Min.   :1    Min.   :1      
##  1st Qu.:1    1st Qu.:1         1st Qu.:1    1st Qu.:1      
##  Median :1    Median :1         Median :1    Median :1      
##  Mean   :1    Mean   :1         Mean   :1    Mean   :1      
##  3rd Qu.:1    3rd Qu.:1         3rd Qu.:1    3rd Qu.:1      
##  Max.   :1    Max.   :1         Max.   :1    Max.   :1      
##   Metabolomics  Hormonomics X_A_300_OverView.R
##  Min.   :1     Min.   :1    Min.   : 1.0      
##  1st Qu.:1     1st Qu.:1    1st Qu.: 5.5      
##  Median :1     Median :1    Median : 7.5      
##  Mean   :1     Mean   :1    Mean   : 7.5      
##  3rd Qu.:1     3rd Qu.:1    3rd Qu.: 9.5      
##  Max.   :1     Max.   :1    Max.   :14.0
apply(pdata,2,function(x) summary(as.factor(x)))
## $Treatment
##  C  H 
## 16 16 
## 
## $Harvest
## 1 2 3 4 
## 8 8 8 8 
## 
## $SamplingDay
##  1  7  8 14 
##  8  8  8  8 
## 
## $DaysOfStressH
##  0  1  7  8 14 
## 16  4  4  4  4 
## 
## $DaysOfStressD
##  0 
## 32 
## 
## $DaysOfStressW
##  0 
## 32 
## 
## $DaysRecovery
##  0 
## 32 
## 
## $Replicate
## 1 2 3 4 
## 8 8 8 8 
## 
## $Sample.type
##  S 
## 32 
## 
## $Date
## 2020-11-04 2020-11-10 2020-11-11 2020-11-17 
##          8          8          8          8 
## 
## $Heat.Drought.Water.Recovery.Days
##  0_0_0_0  1_0_0_0 14_0_0_0  7_0_0_0  8_0_0_0 
##       16        4        4        4        4 
## 
## $TreatmentxDatexPlant
## C_2020-11-04_1 C_2020-11-04_2 C_2020-11-04_3 C_2020-11-04_4 
##              1              1              1              1 
## C_2020-11-10_1 C_2020-11-10_2 C_2020-11-10_3 C_2020-11-10_4 
##              1              1              1              1 
## C_2020-11-11_1 C_2020-11-11_2 C_2020-11-11_3 C_2020-11-11_4 
##              1              1              1              1 
## C_2020-11-17_1 C_2020-11-17_2 C_2020-11-17_3 C_2020-11-17_4 
##              1              1              1              1 
## H_2020-11-04_1 H_2020-11-04_2 H_2020-11-04_3 H_2020-11-04_4 
##              1              1              1              1 
## H_2020-11-10_1 H_2020-11-10_2 H_2020-11-10_3 H_2020-11-10_4 
##              1              1              1              1 
## H_2020-11-11_1 H_2020-11-11_2 H_2020-11-11_3 H_2020-11-11_4 
##              1              1              1              1 
## H_2020-11-17_1 H_2020-11-17_2 H_2020-11-17_3 H_2020-11-17_4 
##              1              1              1              1 
## 
## $TreatmentxSamplingDay
##  C_1 C_14  C_7  C_8  H_1 H_14  H_7  H_8 
##    4    4    4    4    4    4    4    4 
## 
## $TreatmentxSamplingDayxReplicateNo
##  C_1_1  C_1_2  C_1_3  C_1_4 C_14_1 C_14_2 C_14_3 C_14_4  C_7_1  C_7_2 
##      1      1      1      1      1      1      1      1      1      1 
##  C_7_3  C_7_4  C_8_1  C_8_2  C_8_3  C_8_4  H_1_1  H_1_2  H_1_3  H_1_4 
##      1      1      1      1      1      1      1      1      1      1 
## H_14_1 H_14_2 H_14_3 H_14_4  H_7_1  H_7_2  H_7_3  H_7_4  H_8_1  H_8_2 
##      1      1      1      1      1      1      1      1      1      1 
##  H_8_3  H_8_4 
##      1      1 
## 
## $Comment
## only for cor 
##           32 
## 
## $Num_Tubers
## NA's 
##   32 
## 
## $Total_Tubers_Weight
## NA's 
##   32 
## 
## $FW_SH_Total
##  1 
## 32 
## 
## $DW_SH_Total
##  1 
## 32 
## 
## $Fv.Fm
##  1 
## 32 
## 
## $qL
##  1 
## 32 
## 
## $deltaT
##  1 
## 32 
## 
## $TOP.AREA
##  1 
## 32 
## 
## $COMPACTNESS
##  1 
## 32 
## 
## $WATER.CONSUMPTION
##  1 
## 32 
## 
## $Proteomics
##  1 
## 32 
## 
## $Transcriptomics
##  1 
## 32 
## 
## $Metabolomics
##  1 
## 32 
## 
## $Hormonomics
##  1 
## 32 
## 
## $X_A_300_OverView.R
##  1  7  8 14 
##  8  8  8  8

2.4 Process data files

For this project we aim to integrate several multi-omics datasets. We have data on hormonomics, metabolomics, and qPCR:

%% {r} %% hormonomics <- read.table(file1, header=TRUE, sep="\t") %% metabolomics <- read.table(file2, header=TRUE, sep="\t") %% qPCR <- read.table(file3, header=TRUE, sep="\t") %%

Read all datafiles and assign them to named objects

for (i in 1:length(files)){
  print(datanames[i])
  assign(datanames[i], read.table(files[i], header=TRUE, sep="\t"))
  print(head(get(datanames[i])))
}
## [1] "hormonomics"
##   SampleID      IAA    oxIAA  IAA.Asp      ABA        PA       DPA
## 1   C_S1_1 37.19565 61.49305 2.224072 36.78497  91.99991  45.63410
## 2   C_S1_2 45.86649 67.84222 1.990994 33.11733  92.61196  62.10651
## 3   C_S1_3 47.60516 52.49759 1.496438 41.66091  93.83119  55.06159
## 4   C_S1_4 37.55562 48.69611 2.240201 39.11995  91.35024  59.82576
## 5   C_S7_1 49.46230 76.79460 1.927162 80.00281 231.68757 178.08837
## 6   C_S7_2 78.91379 93.14059 1.856353 49.76028 166.08573 149.86757
##          SA       JA    JA.Ile X9.10.dhJA X12.OH.JA   cisOPDA
## 1  505.2129 2.687303 0.5525212   5.449105  16.13451  353.8608
## 2  519.3793 2.802456 0.5661384   3.700174  23.95458  402.7049
## 3  275.2972 4.761132 0.4273195   2.879761  27.93628  644.9688
## 4  628.0462 4.622662 0.2028412   5.165337  25.72584  750.3532
## 5 1314.9488 6.101443 0.6226329   5.010268 244.12270 1294.9082
## 6  731.2122 3.092070 0.3449961   8.373161 203.74167  873.4217
## [1] "metabolomics"
##   SampleID Glukose Fructose Sucrose Starch    Asp    Glu   Asn   Ser
## 1   C_S1_1    2.13     2.70   3.449  22.06 1039.5 2514.0 177.8 597.9
## 2   C_S1_2    2.20     2.90   3.382  12.74  844.3 1966.4 167.8 498.1
## 3   C_S1_3    0.82     1.59   2.452   9.98  887.2 2067.8 167.0 441.1
## 4   C_S1_4    2.55     3.01   4.057  15.13  987.6 2347.9 172.3 537.6
## 5   C_S7_1    4.77     3.97   3.533  16.25  793.2 2109.0 276.9 368.5
## 6   C_S7_2    6.30     7.16   6.280   9.57 1391.6 3344.4 498.8 704.2
##     Gln   Gly  His  Arg   Thr   Ala   Pro  Tyr  Val Met  Ile  Lys
## 1 497.8 137.8 20.7 26.9 225.4 702.5  48.6 25.3 54.1 7.3 48.8 26.0
## 2 408.8 104.6 13.3 29.1 208.5 495.5  57.3 22.1 51.3 6.9 47.0 21.8
## 3 400.1  92.1 17.1 29.5 216.1 514.6  53.5 24.4 52.9 8.6 54.3 26.5
## 4 465.7 117.2 16.8 24.9 252.0 653.1  58.6 22.3 58.4 7.9 52.9 26.7
## 5 265.7  68.7 13.7 38.3 188.6 296.4  85.8 31.2 73.1 4.7 60.4 28.4
## 6 680.9 101.2 19.4 63.9 302.4 664.6 135.7 40.6 96.3 8.6 80.6 33.1
##    Leu   Phe
## 1 18.8 119.0
## 2 15.7  88.0
## 3 19.5  86.8
## 4 20.1 111.0
## 5 27.4  98.3
## 6 27.1 171.2
## [1] "qPCR"
##   SampleID    RbohA    SnRK2     ACO2    HSP70     PR1b    RD29B
## 1   C_S1_1 1.346213 1.497998 0.150379 1.013451 0.324007 0.016995
## 2   C_S1_2 1.496014 1.627412 0.196017 1.067230 0.154344 0.046504
## 3   C_S1_3 1.262812 1.630533 0.440406 0.883357 0.269488 0.016995
## 4   C_S1_4 1.428252 1.530426 0.176572 1.042068 0.255647 0.082387
## 5   C_S7_1 1.404150 1.122600 0.537281 1.028170 0.226714 1.751881
## 6   C_S7_2 0.848707 0.725752 0.150857 0.640827 0.192341 0.113202
##    X13.LOX     P5CS     ERF1     CAT1       CO    SWEET     SP6A
## 1 0.699873 3.647878 0.560424 0.639920 2.421594 0.816298 0.122374
## 2 0.660413 3.264675 0.637808 0.635107 5.563090 1.873813 0.238811
## 3 0.765764 2.151917 0.585558 0.687279 2.740914 0.930861 0.330151
## 4 0.734167 3.155558 0.663618 0.704880 2.469193 0.933900 0.122374
## 5 1.136201 0.862709 1.634810 0.811006 1.709219 1.494808 3.386457
## 6 0.540666 0.908966 1.239857 0.373886 1.580358 0.853875 0.810631
##     M0ZJG3
## 1 1.211115
## 2 1.376397
## 3 1.007479
## 4 0.903495
## 5 2.357865
## 6 1.387190
## [1] "proteomics"
##   SampleID VdnDe2_86784 PBdnRY1_7025 VdnPW5_1541 CLCdnPW10_11437
## 1   C_S1_1  0.000413884  0.000469004 0.002308758              NA
## 2   C_S1_2  0.000230027  0.000469191 0.002352451     0.000300619
## 3   C_S1_3  0.000449712  0.000524164 0.001314037              NA
## 4   C_S1_4  0.000255355  0.000520852 0.001661846     0.000222480
## 5   C_S7_1  0.000178758           NA 0.000498582     0.000622978
## 6   C_S7_2  0.000226049  0.000307384 0.000910694     0.000787786
##   CLCdnDe2_96512 VdnPW1_81435 CLCdnDe6_4953 CLCdnDe13_5738
## 1             NA     2.82e-05   0.000386853             NA
## 2             NA     2.35e-05   0.000322506             NA
## 3             NA     2.62e-05   0.000520422             NA
## 4             NA           NA   0.000397796             NA
## 5             NA           NA   0.000222778             NA
## 6             NA           NA   0.000563427             NA
##   CLCdnPW12_1472 CLCdnPW49_8560 VdnPW4_18801 SdnRY1_10712
## 1    0.000767766    0.000039000  0.000203446  0.000272489
## 2    0.000474884             NA  0.000197874  0.000265026
## 3    0.000530523    0.000072600  0.000189478  0.000338374
## 4    0.000584473    0.000036100  0.000188281  0.000336236
## 5    0.000288815    0.000101007  0.000439347  0.000470757
## 6    0.000432853             NA  0.000407422  0.000347256
##   CLCdnDe2_311 CLCdnDe13_63179 CLCdnDe10_24187 TDe_comp110944_c2_seq4
## 1  0.000257351     0.000135326     0.000119484            0.000050600
## 2  0.000178787     0.000037600     0.000033200                     NA
## 3  0.000159788     0.000042000     0.000037100            0.000094200
## 4  0.000238167     0.000041700     0.000073700            0.000046800
## 5  0.000222302     0.000116896              NA            0.000131029
## 6  0.000187408     0.000246367              NA            0.000165693
##   PBdnRY1_11085 CLCdnPW48_15164 CLCdnPW31_956 VdnDe1_162036
## 1      4.44e-05     0.000263690   0.000334185   0.000146521
## 2      1.85e-05     0.000302265   0.000167159   0.000122150
## 3      4.14e-05     0.000276283   0.000145245   0.000091000
## 4            NA     0.000274538   0.000247419   0.000180799
## 5            NA     0.000427083   0.000288672   0.000253132
## 6            NA     0.000216027   0.000243360   0.000160049
##   PBdnRY1_2239 CLCdnPW22_15452 CLCdnPW50_2470 VdnPW3_37011 SdnPW1_46
## 1  0.000165383     0.000161016             NA  0.000123402  1.49e-05
## 2  0.000068900     0.000089500             NA  0.000085700        NA
## 3  0.000057800     0.000100000             NA  0.000057500        NA
## 4  0.000057400     0.000149014             NA  0.000114203  1.38e-05
## 5  0.000107145              NA             NA  0.000106596        NA
## 6  0.000045200     0.000058600             NA  0.000112329  3.26e-05
##   CLCdnDe7_34340 CLCdnDe11_2286 VdnDe2_53978 CLCdnDe7_6144
## 1             NA             NA  0.000323764   0.000176082
## 2             NA             NA  0.000337389   0.000366984
## 3             NA             NA  0.000452302   0.000245989
## 4             NA             NA  0.000524353   0.000244435
## 5             NA    0.000369451           NA   0.000228152
## 6             NA    0.000311459  0.001149384   0.000192340
##   CLCdnRY10_6741 PBdnRY1_5389 VdnDe2_29290 VdnDe4_115100
## 1    0.000141861  0.000708110  0.000141561   0.000711383
## 2    0.000059100  0.000641660  0.000094400   0.000773551
## 3    0.000066100  0.000602145  0.000158210   0.000547316
## 4    0.000065600  0.000598341  0.000183412   0.000572483
## 5             NA  0.000877616  0.000073400   0.001041977
## 6    0.000077500  0.000807118  0.000092800   0.001047347
##   Sotub11g025210.1.1 VdnDe1_39992 VdnPW4_15664
## 1        0.000228539  0.000389774           NA
## 2        0.000285788  0.000541569           NA
## 3        0.000319272  0.000423515           NA
## 4        0.000211504  0.000360720           NA
## 5        0.000197414  0.000505036           NA
## 6        0.000332853  0.000425761           NA

Check if all 4 objects are created

datanames
## [1] "hormonomics"  "metabolomics" "qPCR"         "proteomics"
datanames %in% ls()
## [1] TRUE TRUE TRUE TRUE

2.5 Combine datasets

Datasets for DIABLO need to be collected in a list, with rows corresponding to the same samples. The order of samples from shrinked phenodata will be enforced.

The first component of the list will be a grouping variable, indicating the conditions. We will create reasonable names for groups.

sday <- paste0(0,pdata$SamplingDay)
len <- nchar(sday)
sday <- substr(sday,len-1,len)
trt <- pdata$Treatment
what <- paste(trt,sday,sep="")
what
##  [1] "C01" "C01" "C01" "C01" "C07" "C07" "C07" "C07" "C08" "C08" "C08"
## [12] "C08" "C14" "C14" "C14" "C14" "H01" "H01" "H01" "H01" "H07" "H07"
## [23] "H07" "H07" "H08" "H08" "H08" "H08" "H14" "H14" "H14" "H14"
X <- list(status= what)
names(X[[1]]) <- rownames(pdata)
X
## $status
##  C_S1_1  C_S1_2  C_S1_3  C_S1_4  C_S7_1  C_S7_2  C_S7_3  C_S7_4 
##   "C01"   "C01"   "C01"   "C01"   "C07"   "C07"   "C07"   "C07" 
##  C_S8_1  C_S8_2  C_S8_3  C_S8_4 C_S14_1 C_S14_2 C_S14_3 C_S14_4 
##   "C08"   "C08"   "C08"   "C08"   "C14"   "C14"   "C14"   "C14" 
##  H_S1_1  H_S1_2  H_S1_3  H_S1_4  H_S7_1  H_S7_2  H_S7_3  H_S7_4 
##   "H01"   "H01"   "H01"   "H01"   "H07"   "H07"   "H07"   "H07" 
##  H_S8_1  H_S8_2  H_S8_3  H_S8_4 H_S14_1 H_S14_2 H_S14_3 H_S14_4 
##   "H08"   "H08"   "H08"   "H08"   "H14"   "H14"   "H14"   "H14"
print(addmargins(table(pdata$SamplingDay, what)), zero.print=".")
##      what
##       C01 C07 C08 C14 H01 H07 H08 H14 Sum
##   1     4   .   .   .   4   .   .   .   8
##   7     .   4   .   .   .   4   .   .   8
##   8     .   .   4   .   .   .   4   .   8
##   14    .   .   .   4   .   .   .   4   8
##   Sum   4   4   4   4   4   4   4   4  32
print(addmargins(table(pdata$Treatment, what)), zero.print=".")
##      what
##       C01 C07 C08 C14 H01 H07 H08 H14 Sum
##   C     4   4   4   4   .   .   .   .  16
##   H     .   .   .   .   4   4   4   4  16
##   Sum   4   4   4   4   4   4   4   4  32

Put datasets into the list X and ensure that they all have same order of samples as in phenodata.

datanames
## [1] "hormonomics"  "metabolomics" "qPCR"         "proteomics"
i <- 1
for(i in 1:length(datanames)){
x <- get(datanames[i])
rownames(x) <- x[,1]
x <- x[,-1]
X[[i+1]] <- x[rownames(pdata),]
names(X)[i+1] <- datanames[i]
}
str(X)
## List of 5
##  $ status      : Named chr [1:32] "C01" "C01" "C01" "C01" ...
##   ..- attr(*, "names")= chr [1:32] "C_S1_1" "C_S1_2" "C_S1_3" "C_S1_4" ...
##  $ hormonomics :'data.frame':    32 obs. of  12 variables:
##   ..$ IAA       : num [1:32] 37.2 45.9 47.6 37.6 49.5 ...
##   ..$ oxIAA     : num [1:32] 61.5 67.8 52.5 48.7 76.8 ...
##   ..$ IAA.Asp   : num [1:32] 2.22 1.99 1.5 2.24 1.93 ...
##   ..$ ABA       : num [1:32] 36.8 33.1 41.7 39.1 80 ...
##   ..$ PA        : num [1:32] 92 92.6 93.8 91.4 231.7 ...
##   ..$ DPA       : num [1:32] 45.6 62.1 55.1 59.8 178.1 ...
##   ..$ SA        : num [1:32] 505 519 275 628 1315 ...
##   ..$ JA        : num [1:32] 2.69 2.8 4.76 4.62 6.1 ...
##   ..$ JA.Ile    : num [1:32] 0.553 0.566 0.427 0.203 0.623 ...
##   ..$ X9.10.dhJA: num [1:32] 5.45 3.7 2.88 5.17 5.01 ...
##   ..$ X12.OH.JA : num [1:32] 16.1 24 27.9 25.7 244.1 ...
##   ..$ cisOPDA   : num [1:32] 354 403 645 750 1295 ...
##  $ metabolomics:'data.frame':    32 obs. of  22 variables:
##   ..$ Glukose : num [1:32] 2.13 2.2 0.82 2.55 4.77 6.3 7.24 3.09 6.34 9.49 ...
##   ..$ Fructose: num [1:32] 2.7 2.9 1.59 3.01 3.97 7.16 5.41 4.04 8.52 7.75 ...
##   ..$ Sucrose : num [1:32] 3.45 3.38 2.45 4.06 3.53 ...
##   ..$ Starch  : num [1:32] 22.06 12.74 9.98 15.13 16.25 ...
##   ..$ Asp     : num [1:32] 1040 844 887 988 793 ...
##   ..$ Glu     : num [1:32] 2514 1966 2068 2348 2109 ...
##   ..$ Asn     : num [1:32] 178 168 167 172 277 ...
##   ..$ Ser     : num [1:32] 598 498 441 538 368 ...
##   ..$ Gln     : num [1:32] 498 409 400 466 266 ...
##   ..$ Gly     : num [1:32] 137.8 104.6 92.1 117.2 68.7 ...
##   ..$ His     : num [1:32] 20.7 13.3 17.1 16.8 13.7 19.4 20.5 17.8 8.8 18.4 ...
##   ..$ Arg     : num [1:32] 26.9 29.1 29.5 24.9 38.3 63.9 38.6 47.3 54.7 67.2 ...
##   ..$ Thr     : num [1:32] 225 208 216 252 189 ...
##   ..$ Ala     : num [1:32] 702 496 515 653 296 ...
##   ..$ Pro     : num [1:32] 48.6 57.3 53.5 58.6 85.8 ...
##   ..$ Tyr     : num [1:32] 25.3 22.1 24.4 22.3 31.2 40.6 50.9 37.1 20.2 34.6 ...
##   ..$ Val     : num [1:32] 54.1 51.3 52.9 58.4 73.1 ...
##   ..$ Met     : num [1:32] 7.3 6.9 8.6 7.9 4.7 8.6 4.5 6.6 3.8 2.2 ...
##   ..$ Ile     : num [1:32] 48.8 47 54.3 52.9 60.4 80.6 62.7 63 29.9 48.3 ...
##   ..$ Lys     : num [1:32] 26 21.8 26.5 26.7 28.4 33.1 29.2 38.4 20.6 41.8 ...
##   ..$ Leu     : num [1:32] 18.8 15.7 19.5 20.1 27.4 27.1 21.3 26.3 38.5 45.7 ...
##   ..$ Phe     : num [1:32] 119 88 86.8 111 98.3 ...
##  $ qPCR        :'data.frame':    32 obs. of  14 variables:
##   ..$ RbohA  : num [1:32] 1.35 1.5 1.26 1.43 1.4 ...
##   ..$ SnRK2  : num [1:32] 1.5 1.63 1.63 1.53 1.12 ...
##   ..$ ACO2   : num [1:32] 0.15 0.196 0.44 0.177 0.537 ...
##   ..$ HSP70  : num [1:32] 1.013 1.067 0.883 1.042 1.028 ...
##   ..$ PR1b   : num [1:32] 0.324 0.154 0.269 0.256 0.227 ...
##   ..$ RD29B  : num [1:32] 0.017 0.0465 0.017 0.0824 1.7519 ...
##   ..$ X13.LOX: num [1:32] 0.7 0.66 0.766 0.734 1.136 ...
##   ..$ P5CS   : num [1:32] 3.648 3.265 2.152 3.156 0.863 ...
##   ..$ ERF1   : num [1:32] 0.56 0.638 0.586 0.664 1.635 ...
##   ..$ CAT1   : num [1:32] 0.64 0.635 0.687 0.705 0.811 ...
##   ..$ CO     : num [1:32] 2.42 5.56 2.74 2.47 1.71 ...
##   ..$ SWEET  : num [1:32] 0.816 1.874 0.931 0.934 1.495 ...
##   ..$ SP6A   : num [1:32] 0.122 0.239 0.33 0.122 3.386 ...
##   ..$ M0ZJG3 : num [1:32] 1.211 1.376 1.007 0.903 2.358 ...
##  $ proteomics  :'data.frame':    32 obs. of  36 variables:
##   ..$ VdnDe2_86784          : num [1:32] 0.000414 0.00023 0.00045 0.000255 0.000179 ...
##   ..$ PBdnRY1_7025          : num [1:32] 0.000469 0.000469 0.000524 0.000521 NA ...
##   ..$ VdnPW5_1541           : num [1:32] 0.002309 0.002352 0.001314 0.001662 0.000499 ...
##   ..$ CLCdnPW10_11437       : num [1:32] NA 0.000301 NA 0.000222 0.000623 ...
##   ..$ CLCdnDe2_96512        : num [1:32] NA NA NA NA NA ...
##   ..$ VdnPW1_81435          : num [1:32] 2.82e-05 2.35e-05 2.62e-05 NA NA NA NA NA NA NA ...
##   ..$ CLCdnDe6_4953         : num [1:32] 0.000387 0.000323 0.00052 0.000398 0.000223 ...
##   ..$ CLCdnDe13_5738        : num [1:32] NA NA NA NA NA NA 4.72e-05 NA NA NA ...
##   ..$ CLCdnPW12_1472        : num [1:32] 0.000768 0.000475 0.000531 0.000584 0.000289 ...
##   ..$ CLCdnPW49_8560        : num [1:32] 3.90e-05 NA 7.26e-05 3.61e-05 1.01e-04 ...
##   ..$ VdnPW4_18801          : num [1:32] 0.000203 0.000198 0.000189 0.000188 0.000439 ...
##   ..$ SdnRY1_10712          : num [1:32] 0.000272 0.000265 0.000338 0.000336 0.000471 ...
##   ..$ CLCdnDe2_311          : num [1:32] 0.000257 0.000179 0.00016 0.000238 0.000222 ...
##   ..$ CLCdnDe13_63179       : num [1:32] 1.35e-04 3.76e-05 4.20e-05 4.17e-05 1.17e-04 ...
##   ..$ CLCdnDe10_24187       : num [1:32] 1.19e-04 3.32e-05 3.71e-05 7.37e-05 NA ...
##   ..$ TDe_comp110944_c2_seq4: num [1:32] 5.06e-05 NA 9.42e-05 4.68e-05 1.31e-04 ...
##   ..$ PBdnRY1_11085         : num [1:32] 4.44e-05 1.85e-05 4.14e-05 NA NA NA NA NA NA NA ...
##   ..$ CLCdnPW48_15164       : num [1:32] 0.000264 0.000302 0.000276 0.000275 0.000427 ...
##   ..$ CLCdnPW31_956         : num [1:32] 0.000334 0.000167 0.000145 0.000247 0.000289 ...
##   ..$ VdnDe1_162036         : num [1:32] 0.000147 0.000122 0.000091 0.000181 0.000253 ...
##   ..$ PBdnRY1_2239          : num [1:32] 1.65e-04 6.89e-05 5.78e-05 5.74e-05 1.07e-04 ...
##   ..$ CLCdnPW22_15452       : num [1:32] 1.61e-04 8.95e-05 1.00e-04 1.49e-04 NA ...
##   ..$ CLCdnPW50_2470        : num [1:32] NA NA NA NA NA ...
##   ..$ VdnPW3_37011          : num [1:32] 1.23e-04 8.57e-05 5.75e-05 1.14e-04 1.07e-04 ...
##   ..$ SdnPW1_46             : num [1:32] 1.49e-05 NA NA 1.38e-05 NA ...
##   ..$ CLCdnDe7_34340        : num [1:32] NA NA NA NA NA NA NA NA NA NA ...
##   ..$ CLCdnDe11_2286        : num [1:32] NA NA NA NA 0.000369 ...
##   ..$ VdnDe2_53978          : num [1:32] 0.000324 0.000337 0.000452 0.000524 NA ...
##   ..$ CLCdnDe7_6144         : num [1:32] 0.000176 0.000367 0.000246 0.000244 0.000228 ...
##   ..$ CLCdnRY10_6741        : num [1:32] 1.42e-04 5.91e-05 6.61e-05 6.56e-05 NA ...
##   ..$ PBdnRY1_5389          : num [1:32] 0.000708 0.000642 0.000602 0.000598 0.000878 ...
##   ..$ VdnDe2_29290          : num [1:32] 1.42e-04 9.44e-05 1.58e-04 1.83e-04 7.34e-05 ...
##   ..$ VdnDe4_115100         : num [1:32] 0.000711 0.000774 0.000547 0.000572 0.001042 ...
##   ..$ Sotub11g025210.1.1    : num [1:32] 0.000229 0.000286 0.000319 0.000212 0.000197 ...
##   ..$ VdnDe1_39992          : num [1:32] 0.00039 0.000542 0.000424 0.000361 0.000505 ...
##   ..$ VdnPW4_15664          : num [1:32] NA NA NA NA NA NA NA NA 1.66e-05 NA ...
names(X)
## [1] "status"       "hormonomics"  "metabolomics" "qPCR"        
## [5] "proteomics"

Check if sample names in all datasets match.

OK <- TRUE
for(i in 2:length(X)) {
print(ok <- all(names(X[[1]])==rownames(X[[i]])))
OK <- OK&ok
}
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE

Sample names in datasets match.

Put data into safe object DATA.

DATA <- X

We will also need the names of treatment groups.

groups <- unique(pdata$Treatment)
groups
## [1] "C" "H"

3 Multiomics data integration with DIABLO

CCDATA <- DATA
names(CCDATA)
## [1] "status"       "hormonomics"  "metabolomics" "qPCR"        
## [5] "proteomics"
write("Entering 035-DIABLO !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!", "bla.log", append=!TRUE)
write("commandArgs:", "bla.log", append=TRUE)
write(commandArgs(trailingOnly = TRUE), "bla.log",append=TRUE)
write("End commandArgs", "bla.log", append=TRUE)
out <- ""
out <- paste(out,knit_child("035-DIABLO-2.Rmd", quiet=TRUE))
cat(out)

Child: 035-DIABLO-2.Rmd ## DIABLO hormonomics, metabolomics, qPCR, proteomics ## DIABLO

DIABLO from mixOmics enables integration of more than two datasets.

3.1 Organize data

Thre datasets are organized as a list of matrices with same samples as rows and variables in columns.

data <- CCDATA[-1]
str(data)
## List of 4
##  $ hormonomics :'data.frame':    32 obs. of  12 variables:
##   ..$ IAA       : num [1:32] 37.2 45.9 47.6 37.6 49.5 ...
##   ..$ oxIAA     : num [1:32] 61.5 67.8 52.5 48.7 76.8 ...
##   ..$ IAA.Asp   : num [1:32] 2.22 1.99 1.5 2.24 1.93 ...
##   ..$ ABA       : num [1:32] 36.8 33.1 41.7 39.1 80 ...
##   ..$ PA        : num [1:32] 92 92.6 93.8 91.4 231.7 ...
##   ..$ DPA       : num [1:32] 45.6 62.1 55.1 59.8 178.1 ...
##   ..$ SA        : num [1:32] 505 519 275 628 1315 ...
##   ..$ JA        : num [1:32] 2.69 2.8 4.76 4.62 6.1 ...
##   ..$ JA.Ile    : num [1:32] 0.553 0.566 0.427 0.203 0.623 ...
##   ..$ X9.10.dhJA: num [1:32] 5.45 3.7 2.88 5.17 5.01 ...
##   ..$ X12.OH.JA : num [1:32] 16.1 24 27.9 25.7 244.1 ...
##   ..$ cisOPDA   : num [1:32] 354 403 645 750 1295 ...
##  $ metabolomics:'data.frame':    32 obs. of  22 variables:
##   ..$ Glukose : num [1:32] 2.13 2.2 0.82 2.55 4.77 6.3 7.24 3.09 6.34 9.49 ...
##   ..$ Fructose: num [1:32] 2.7 2.9 1.59 3.01 3.97 7.16 5.41 4.04 8.52 7.75 ...
##   ..$ Sucrose : num [1:32] 3.45 3.38 2.45 4.06 3.53 ...
##   ..$ Starch  : num [1:32] 22.06 12.74 9.98 15.13 16.25 ...
##   ..$ Asp     : num [1:32] 1040 844 887 988 793 ...
##   ..$ Glu     : num [1:32] 2514 1966 2068 2348 2109 ...
##   ..$ Asn     : num [1:32] 178 168 167 172 277 ...
##   ..$ Ser     : num [1:32] 598 498 441 538 368 ...
##   ..$ Gln     : num [1:32] 498 409 400 466 266 ...
##   ..$ Gly     : num [1:32] 137.8 104.6 92.1 117.2 68.7 ...
##   ..$ His     : num [1:32] 20.7 13.3 17.1 16.8 13.7 19.4 20.5 17.8 8.8 18.4 ...
##   ..$ Arg     : num [1:32] 26.9 29.1 29.5 24.9 38.3 63.9 38.6 47.3 54.7 67.2 ...
##   ..$ Thr     : num [1:32] 225 208 216 252 189 ...
##   ..$ Ala     : num [1:32] 702 496 515 653 296 ...
##   ..$ Pro     : num [1:32] 48.6 57.3 53.5 58.6 85.8 ...
##   ..$ Tyr     : num [1:32] 25.3 22.1 24.4 22.3 31.2 40.6 50.9 37.1 20.2 34.6 ...
##   ..$ Val     : num [1:32] 54.1 51.3 52.9 58.4 73.1 ...
##   ..$ Met     : num [1:32] 7.3 6.9 8.6 7.9 4.7 8.6 4.5 6.6 3.8 2.2 ...
##   ..$ Ile     : num [1:32] 48.8 47 54.3 52.9 60.4 80.6 62.7 63 29.9 48.3 ...
##   ..$ Lys     : num [1:32] 26 21.8 26.5 26.7 28.4 33.1 29.2 38.4 20.6 41.8 ...
##   ..$ Leu     : num [1:32] 18.8 15.7 19.5 20.1 27.4 27.1 21.3 26.3 38.5 45.7 ...
##   ..$ Phe     : num [1:32] 119 88 86.8 111 98.3 ...
##  $ qPCR        :'data.frame':    32 obs. of  14 variables:
##   ..$ RbohA  : num [1:32] 1.35 1.5 1.26 1.43 1.4 ...
##   ..$ SnRK2  : num [1:32] 1.5 1.63 1.63 1.53 1.12 ...
##   ..$ ACO2   : num [1:32] 0.15 0.196 0.44 0.177 0.537 ...
##   ..$ HSP70  : num [1:32] 1.013 1.067 0.883 1.042 1.028 ...
##   ..$ PR1b   : num [1:32] 0.324 0.154 0.269 0.256 0.227 ...
##   ..$ RD29B  : num [1:32] 0.017 0.0465 0.017 0.0824 1.7519 ...
##   ..$ X13.LOX: num [1:32] 0.7 0.66 0.766 0.734 1.136 ...
##   ..$ P5CS   : num [1:32] 3.648 3.265 2.152 3.156 0.863 ...
##   ..$ ERF1   : num [1:32] 0.56 0.638 0.586 0.664 1.635 ...
##   ..$ CAT1   : num [1:32] 0.64 0.635 0.687 0.705 0.811 ...
##   ..$ CO     : num [1:32] 2.42 5.56 2.74 2.47 1.71 ...
##   ..$ SWEET  : num [1:32] 0.816 1.874 0.931 0.934 1.495 ...
##   ..$ SP6A   : num [1:32] 0.122 0.239 0.33 0.122 3.386 ...
##   ..$ M0ZJG3 : num [1:32] 1.211 1.376 1.007 0.903 2.358 ...
##  $ proteomics  :'data.frame':    32 obs. of  36 variables:
##   ..$ VdnDe2_86784          : num [1:32] 0.000414 0.00023 0.00045 0.000255 0.000179 ...
##   ..$ PBdnRY1_7025          : num [1:32] 0.000469 0.000469 0.000524 0.000521 NA ...
##   ..$ VdnPW5_1541           : num [1:32] 0.002309 0.002352 0.001314 0.001662 0.000499 ...
##   ..$ CLCdnPW10_11437       : num [1:32] NA 0.000301 NA 0.000222 0.000623 ...
##   ..$ CLCdnDe2_96512        : num [1:32] NA NA NA NA NA ...
##   ..$ VdnPW1_81435          : num [1:32] 2.82e-05 2.35e-05 2.62e-05 NA NA NA NA NA NA NA ...
##   ..$ CLCdnDe6_4953         : num [1:32] 0.000387 0.000323 0.00052 0.000398 0.000223 ...
##   ..$ CLCdnDe13_5738        : num [1:32] NA NA NA NA NA NA 4.72e-05 NA NA NA ...
##   ..$ CLCdnPW12_1472        : num [1:32] 0.000768 0.000475 0.000531 0.000584 0.000289 ...
##   ..$ CLCdnPW49_8560        : num [1:32] 3.90e-05 NA 7.26e-05 3.61e-05 1.01e-04 ...
##   ..$ VdnPW4_18801          : num [1:32] 0.000203 0.000198 0.000189 0.000188 0.000439 ...
##   ..$ SdnRY1_10712          : num [1:32] 0.000272 0.000265 0.000338 0.000336 0.000471 ...
##   ..$ CLCdnDe2_311          : num [1:32] 0.000257 0.000179 0.00016 0.000238 0.000222 ...
##   ..$ CLCdnDe13_63179       : num [1:32] 1.35e-04 3.76e-05 4.20e-05 4.17e-05 1.17e-04 ...
##   ..$ CLCdnDe10_24187       : num [1:32] 1.19e-04 3.32e-05 3.71e-05 7.37e-05 NA ...
##   ..$ TDe_comp110944_c2_seq4: num [1:32] 5.06e-05 NA 9.42e-05 4.68e-05 1.31e-04 ...
##   ..$ PBdnRY1_11085         : num [1:32] 4.44e-05 1.85e-05 4.14e-05 NA NA NA NA NA NA NA ...
##   ..$ CLCdnPW48_15164       : num [1:32] 0.000264 0.000302 0.000276 0.000275 0.000427 ...
##   ..$ CLCdnPW31_956         : num [1:32] 0.000334 0.000167 0.000145 0.000247 0.000289 ...
##   ..$ VdnDe1_162036         : num [1:32] 0.000147 0.000122 0.000091 0.000181 0.000253 ...
##   ..$ PBdnRY1_2239          : num [1:32] 1.65e-04 6.89e-05 5.78e-05 5.74e-05 1.07e-04 ...
##   ..$ CLCdnPW22_15452       : num [1:32] 1.61e-04 8.95e-05 1.00e-04 1.49e-04 NA ...
##   ..$ CLCdnPW50_2470        : num [1:32] NA NA NA NA NA ...
##   ..$ VdnPW3_37011          : num [1:32] 1.23e-04 8.57e-05 5.75e-05 1.14e-04 1.07e-04 ...
##   ..$ SdnPW1_46             : num [1:32] 1.49e-05 NA NA 1.38e-05 NA ...
##   ..$ CLCdnDe7_34340        : num [1:32] NA NA NA NA NA NA NA NA NA NA ...
##   ..$ CLCdnDe11_2286        : num [1:32] NA NA NA NA 0.000369 ...
##   ..$ VdnDe2_53978          : num [1:32] 0.000324 0.000337 0.000452 0.000524 NA ...
##   ..$ CLCdnDe7_6144         : num [1:32] 0.000176 0.000367 0.000246 0.000244 0.000228 ...
##   ..$ CLCdnRY10_6741        : num [1:32] 1.42e-04 5.91e-05 6.61e-05 6.56e-05 NA ...
##   ..$ PBdnRY1_5389          : num [1:32] 0.000708 0.000642 0.000602 0.000598 0.000878 ...
##   ..$ VdnDe2_29290          : num [1:32] 1.42e-04 9.44e-05 1.58e-04 1.83e-04 7.34e-05 ...
##   ..$ VdnDe4_115100         : num [1:32] 0.000711 0.000774 0.000547 0.000572 0.001042 ...
##   ..$ Sotub11g025210.1.1    : num [1:32] 0.000229 0.000286 0.000319 0.000212 0.000197 ...
##   ..$ VdnDe1_39992          : num [1:32] 0.00039 0.000542 0.000424 0.000361 0.000505 ...
##   ..$ VdnPW4_15664          : num [1:32] NA NA NA NA NA NA NA NA 1.66e-05 NA ...
length(data)
## [1] 4
cimfn <- "cim.png"

In addition, outcome, phenotypic state or in our case treatment can also be determined. We have combination of two treatments and four time points.

state <- factor(CCDATA[[1]])
table(state)
## state
## C01 C07 C08 C14 H01 H07 H08 H14 
##   4   4   4   4   4   4   4   4
str(state)
##  Factor w/ 8 levels "C01","C07","C08",..: 1 1 1 1 2 2 2 2 3 3 ...
##  - attr(*, "names")= chr [1:32] "C_S1_1" "C_S1_2" "C_S1_3" "C_S1_4" ...

3.2 Initial analysis

Note: Additional insights can be found at http://mixomics.org/mixdiablo/diablo-tcga-case-study/.

3.2.1 Pairwise PLS comparisons

list.keepX = c(25, 25) # select arbitrary values of features to keep
list.keepY = c(25, 25)
par(mfrow=c(2,2))
pairs <- combn(1:length(names(data)),2)
nms <- names(data)
outn <- ""
j <- 1
ncomp <- length(data)
cols <- c('orange1', 'lightgreen', "red")
if(length(data)==3) pick <- 1:3 else pick <- c(4,1:3)
cols <- c('orange1', 'brown1', 'lightgreen',"lightblue")[pick]
pchs <- c(16, 17, 15, 18)[pick]
j <- 1
cutoff <- 0.5
for(j in 1:ncol(pairs) ){
    pair <- pairs[,j]
    pair
    X <- CCDATA[[pair[1]+1]]
    Y <- CCDATA[[pair[2]+1]]
    list.keepX <- rep(min(ncol(X), 25), ncomp)
    list.keepY <- rep(min(ncol(Y), 25), ncomp)
    x <- spls(X, Y, ncomp=ncomp, keepX = list.keepX, keepY = list.keepY)
    assign(paste0("spls",j),x)
    cat("\n",paste(nms[pair]), "\n")
    cat("Results in:",paste0("spls",j),"\n")
    cat("Correlation between pls variates:\n")
    print(round(cor(x$variates$X, x$variates$Y),5))
#
   plotVar(x, cutoff = cutoff, title = paste(nms[pair],collapse=", "),
        legend = c(nms[pair][1], nms[pair][2]),
        var.names = FALSE, style = 'graphics',
        pch = pchs[pair], cex = c(2,2),
        col = cols[pair])
}
## 
##  hormonomics metabolomics 
## Results in: spls1 
## Correlation between pls variates:
##         comp1    comp2    comp3   comp4
## comp1 0.91126  0.00000  0.00000 0.00000
## comp2 0.13029  0.80764  0.00000 0.00000
## comp3 0.00165 -0.00098  0.61826 0.00000
## comp4 0.01098  0.15104 -0.19116 0.70185
## 
##  hormonomics qPCR 
## Results in: spls2 
## Correlation between pls variates:
##          comp1   comp2   comp3   comp4
## comp1  0.69330 0.00000 0.00000 0.00000
## comp2 -0.40812 0.63635 0.00000 0.00000
## comp3 -0.31053 0.39759 0.54900 0.00000
## comp4 -0.06313 0.13686 0.15843 0.56643
## 
##  hormonomics proteomics 
## Results in: spls3 
## Correlation between pls variates:
##          comp1    comp2   comp3    comp4
## comp1  0.81155 -0.00114 0.00016  0.02427
## comp2  0.37042  0.88279 0.00120 -0.01077
## comp3 -0.02635  0.12354 0.80974  0.01161
## comp4 -0.10965 -0.12524 0.05118  0.70884
## 
##  metabolomics qPCR 
## Results in: spls4 
## Correlation between pls variates:
##          comp1    comp2    comp3   comp4
## comp1  0.74472  0.00000  0.00000 0.00000
## comp2  0.38516  0.60146  0.00000 0.00000
## comp3  0.27693  0.40129  0.61882 0.00000
## comp4 -0.18839 -0.26407 -0.31216 0.42881
Circle Correlation Plots for pairwise PLS models on ADAPT data. At most top 25 features for each dimension with correlation above 0.5, are displayed.

Circle Correlation Plots for pairwise PLS models on ADAPT data. At most top 25 features for each dimension with correlation above 0.5, are displayed.

## 
##  metabolomics proteomics 
## Results in: spls5 
## Correlation between pls variates:
##          comp1    comp2    comp3   comp4
## comp1  0.79399 -0.00814 -0.00062 0.01063
## comp2  0.18909  0.79747  0.00972 0.00378
## comp3 -0.15076  0.06283  0.67921 0.00642
## comp4  0.21992  0.30803  0.09922 0.74192
## 
##  qPCR proteomics 
## Results in: spls6 
## Correlation between pls variates:
##         comp1    comp2    comp3    comp4
## comp1 0.73778 -0.01324  0.00610  0.00301
## comp2 0.12129  0.72044 -0.02926 -0.01731
## comp3 0.32872  0.29495  0.65660 -0.00235
## comp4 0.09182  0.22973  0.32202  0.69532
Circle Correlation Plots for pairwise PLS models on ADAPT data. At most top 25 features for each dimension with correlation above 0.5, are displayed.

Circle Correlation Plots for pairwise PLS models on ADAPT data. At most top 25 features for each dimension with correlation above 0.5, are displayed.

3.2.2 Initial DIABLO model

Following the suggestion in the source, we will use design matrices with small values. This is supposed to keep low classification error rate.

entry <- .entry
design = matrix(entry, ncol = length(data), nrow = length(data),
                dimnames = list(names(data), names(data)))
diag(design) = 0 # set diagonal to 0s

design
##              hormonomics metabolomics qPCR proteomics
## hormonomics          0.0          0.5  0.5        0.5
## metabolomics         0.5          0.0  0.5        0.5
## qPCR                 0.5          0.5  0.0        0.5
## proteomics           0.5          0.5  0.5        0.0

With a design in place, the initial DIABLO model can be generated. An arbitrarily high number of components (ncomp = 5) will be used.

# form basic DIABLO model
Y <- state
basic.diablo.model = block.splsda(X = data, Y = Y, ncomp = 5, design = design)
## Design matrix has changed to include Y; each block will be
##             linked to Y.
######################################## test eval
knitr::opts_chunk$set(eval=!FALSE)

3.2.3 Tuning the number of components

Details of tuning process can be found in the http://mixomics.org/mixdiablo/diablo-tcga-case-study/.

The process can be computer time consuming and was performed separately.

%% To choose the number of components for the final DIABLO model, the function perf() is run with 3-fold cross-validation repeated 10 times. Fold number should be smaller than minimal number of samples in groups. %% %% %% {r eval=FALSE} %% # Tuning of features can take a substantial amount of time. %% # Chunk not evaluated for this document. %% # %% # run component number tuning with repeated CV %% system.time(perf.diablo = perf(basic.diablo.model, validation = 'Mfold', %% folds = 3, nrepeat = 10)) %% %% plot(perf.diablo) # plot output of tuning %% %% %% %% %% {r eval=FALSE} %% # Tuning of features can take a substantial amount of time. %% # Chunk not evaluated for this document. %% # %% # set the optimal ncomp value %% ncomp <- perf.diablo$choice.ncomp$WeightedVote["Overall.BER", "centroids.dist"] %% # show the optimal choice for ncomp for each dist metric %% perf.diablo$choice.ncomp$WeightedVote %% %% %% For classification, the analysis suggests %% the number of components.

From previous tuning sessions one can conclude, that the classification rate stays roughly unchanged after two to four components, so we will set the number of components to the number of data sets:

ncomp <- length(data)
ncomp
## [1] 4

3.2.4 Tuning the number of features

We choose the optimal number of variables to select in each data set using the tune.block.splsda() function, for a grid of keepX values for each type of omics. Note that the function has been set to favour a relatively small signature while allowing us to obtain a sufficient number of variables for downstream validation and/or interpretation. See ?tune.block.splsda.

%% The function tune is run with 10-fold cross validation, but repeated only once. Note that for a more thorough tuning process, provided sufficient computational time, we could increase the nrepeat argument. Here we have saved the results into an RData object that is available for download as the tuning can take a very long time, especially on lower end machines. %% %% %% {r } %% x <- list() %% for (i in 1:length(data)){ %% x[[i]] <- c( seq(5,min(30, ncol(data[[i]])) ,5)) %% } %% names(x) <- names(data) %% test.keepX <- x %% test.keepX %% #list (c(5:9, seq(10, 18, 2), seq(20,30,5)), %% # c(5:9, seq(10, 18, 2), seq(20,30,5)), %% # c(5:9, seq(10, 18, 2), seq(20,30,5))) %% %% %% %% %% {r eval=FALSE} %% # Tuning of features can take a substantial amount of time. %% # Chunk not evaluated for this document. %% # %% # run the feature selection tuning %% system.time(tune.model <- tune.block.splsda(X = data, Y = Y, ncomp = ncomp, cpus=4, %% test.keepX = test.keepX, design = design, %% validation = 'Mfold', folds = 3, nrepeat = 1, %% dist = "centroids.dist") %% ) %% %% %% {r eval=FALSE} %% # Tuning of features can take a substantial amount of time. %% # Chunk not evaluated for this document. %% # %% # run the feature selection tuning %% system.time(tune.model <- tune.block.splsda(X = data, Y = Y, ncomp = ncomp, cpus=4, %% test.keepX = test.keepX, design = design, %% validation = 'loo', folds = 3, nrepeat = 1, %% dist = "centroids.dist") %% ) %% %% %% The number of features to select on each component is returned in %% {r eval=FALSE} %% # Tuning of features can take a substantial amount of time. %% # Chunk not evaluated for this document. %% # %% list.keepX = tune.model$choice.keepX # set the optimal values of features to retain %% list.keepX %%

Previous analyses suggest the following list:

$metabolomics
[1] 10 10 5
$hormonomics
[1] 5 5 10
$qPCR
[1] 10 10 5

We have decided to keep 10 variates for each component.

if(FALSE) keepX <- list(
    metabolomics = rep(10, ncomp),
    hormonomics = rep(10, ncomp),
    qPCR = rep(10, ncomp)
    )
list.keepX = list()
for (i in 1:length(data)) list.keepX[[i]] <- rep(10, ncomp)
names(list.keepX) <- names(data)
list.keepX
## $hormonomics
## [1] 10 10 10 10
## 
## $metabolomics
## [1] 10 10 10 10
## 
## $qPCR
## [1] 10 10 10 10
## 
## $proteomics
## [1] 10 10 10 10

3.3 Final model

The final DIABLO model is run as:

# set the optimised DIABLO model
final.diablo.model = block.splsda(X = data, Y = as.factor(state)
                          ,  ncomp = ncomp
                          , keepX = list.keepX
                          , design = design)
## Design matrix has changed to include Y; each block will be
##             linked to Y.

The selected variables can be extracted with the function selectVar(), for example in each block, as seen below. Note that the stability of selected variables can be extracted from the output of the perf() function.

# the features selected from components
for (comp in 1:ncomp){
cat("\nComponent ", comp,":\n")
for(i in 1:length(data)){
cat(names(data)[i],"\n")
print(selectVar(final.diablo.model, comp = comp)[[i]]$name)
}
}
## 
## Component  1 :
## hormonomics 
##  [1] "DPA"        "SA"         "PA"         "X12.OH.JA"  "JA.Ile"    
##  [6] "X9.10.dhJA" "ABA"        "IAA"        "IAA.Asp"    "oxIAA"     
## metabolomics 
##  [1] "Glukose"  "Tyr"      "Fructose" "Ile"      "Val"      "His"     
##  [7] "Lys"      "Gln"      "Pro"      "Leu"     
## qPCR 
##  [1] "X13.LOX" "CAT1"    "M0ZJG3"  "PR1b"    "HSP70"   "SWEET"  
##  [7] "SnRK2"   "RbohA"   "SP6A"    "CO"     
## proteomics 
##  [1] "PBdnRY1_2239"   "VdnDe1_39992"   "PBdnRY1_5389"  
##  [4] "CLCdnPW49_8560" "CLCdnDe6_4953"  "SdnPW1_46"     
##  [7] "VdnPW3_37011"   "PBdnRY1_7025"   "VdnDe2_29290"  
## [10] "VdnDe4_115100" 
## 
## Component  2 :
## hormonomics 
##  [1] "ABA"        "oxIAA"      "IAA"        "cisOPDA"    "PA"        
##  [6] "SA"         "JA.Ile"     "IAA.Asp"    "JA"         "X9.10.dhJA"
## metabolomics 
##  [1] "Starch"  "Ser"     "Met"     "Asn"     "Arg"     "Sucrose"
##  [7] "Gly"     "His"     "Pro"     "Ala"    
## qPCR 
##  [1] "SP6A"   "PR1b"   "SnRK2"  "RD29B"  "CO"     "P5CS"   "M0ZJG3"
##  [8] "ERF1"   "HSP70"  "RbohA" 
## proteomics 
##  [1] "CLCdnPW31_956"          "CLCdnPW48_15164"       
##  [3] "CLCdnPW22_15452"        "CLCdnRY10_6741"        
##  [5] "VdnPW4_18801"           "CLCdnDe7_6144"         
##  [7] "TDe_comp110944_c2_seq4" "CLCdnDe2_311"          
##  [9] "VdnDe1_162036"          "VdnDe2_29290"          
## 
## Component  3 :
## hormonomics 
##  [1] "JA.Ile"     "JA"         "IAA.Asp"    "oxIAA"      "X9.10.dhJA"
##  [6] "DPA"        "PA"         "ABA"        "X12.OH.JA"  "SA"        
## metabolomics 
##  [1] "Met" "Gln" "Leu" "Ala" "Phe" "Arg" "Gly" "Glu" "Asn" "Ile"
## qPCR 
##  [1] "ACO2"    "RD29B"   "PR1b"    "SnRK2"   "M0ZJG3"  "X13.LOX"
##  [7] "CO"      "ERF1"    "HSP70"   "SP6A"   
## proteomics 
##  [1] "CLCdnPW12_1472"     "VdnDe2_53978"       "Sotub11g025210.1.1"
##  [4] "SdnRY1_10712"       "CLCdnDe7_6144"      "CLCdnDe2_311"      
##  [7] "CLCdnPW22_15452"    "CLCdnPW48_15164"    "PBdnRY1_5389"      
## [10] "CLCdnPW49_8560"    
## 
## Component  4 :
## hormonomics 
##  [1] "IAA"       "X12.OH.JA" "SA"        "JA"        "cisOPDA"  
##  [6] "oxIAA"     "ABA"       "PA"        "IAA.Asp"   "JA.Ile"   
## metabolomics 
##  [1] "Asp"      "Ala"      "Pro"      "Leu"      "Sucrose"  "Glu"     
##  [7] "Asn"      "Fructose" "Gln"      "Tyr"     
## qPCR 
##  [1] "RbohA"   "RD29B"   "P5CS"    "M0ZJG3"  "ACO2"    "SP6A"   
##  [7] "CO"      "HSP70"   "X13.LOX" "SWEET"  
## proteomics 
##  [1] "VdnDe4_115100"          "VdnDe2_53978"          
##  [3] "CLCdnPW10_11437"        "VdnPW5_1541"           
##  [5] "VdnPW4_18801"           "TDe_comp110944_c2_seq4"
##  [7] "CLCdnDe13_63179"        "PBdnRY1_5389"          
##  [9] "CLCdnDe10_24187"        "CLCdnPW12_1472"

3.4 Sample plots

plotDIABLO() is a diagnostic plot to check whether the correlation between components from each data set has been maximised as specified in the design matrix. We specify which dimension to be assessed with the ncomp argument.

for(comp in 1:ncomp){
plotDiablo(final.diablo.model, ncomp = comp)
title(paste("Component",comp), adj=0.1, line=-1, outer=TRUE)
}

The sample plot with the plotIndiv() function projects each sample into the space spanned by the components of each block. Clustering of the samples can be assessed with this plot.

plind <- plotIndiv(final.diablo.model, 
          ind.names = FALSE, 
          legend = TRUE,
          title = 'DIABLO Sample Plots',
          guide="none",
          ellipse = TRUE
          )
## Warning: It is deprecated to specify `guide = FALSE` to remove a
## guide. Please use `guide = "none"` instead.

In the arrow plot below, the start of the arrow indicates the centroid between all data sets for a given sample and the tips of the arrows indicate the location of that sample in each block. Such graphics highlight the agreement between all data sets at the sample level. While somewhat difficult to interpret, even qualitatively, this arrow plot shows proximities of C01 and H01 (both day 1), C07 and C08, and H07 and H08 ( both a day apart). While C samples are in forth quadrant ( D1 < 0, D2 > 0), H samples have ( D1 < 0, D2 < 0) except H14 that is separated on the positive part of Dimension 1.

plotArrow(final.diablo.model, ind.names = FALSE, legend = TRUE,
          title = paste(groups,collapse=", ")

          )

3.5 Variable plots

Several graphical outputs are available to visualise and mine the associations between the selected variables.

The best starting point to evaluate the correlation structure between variables is with the correlation circle plot. A majority of the qPCR variables are positively correlated only with the first component. The hormonomics and metabolomics variables seem to separate along first two dimensions. These first two components correlate highly with the selected variables from the all three dataset. From this, the correlation of each selected feature from all three datasets can be evaluated based on their proximity.

#if(length(data)==3) pick <- 1:3 else pick <- c(4,1:3)
#cols <- c('orange1', 'brown1', 'lightgreen',"lightblue")[pick]
#pchs <- c(16, 17, 15, 18)[pick]
cols <- c('orange1', 'brown1', 'lightgreen','lightblue')[1:length(data)]
pchs <- c(16, 17, 15, 14)[1:length(data)]
plotVar(final.diablo.model, var.names = FALSE,
        style = 'graphics', legend = TRUE
        , pch = pchs, cex = rep(2,length(data))
        , col = cols
)

The circos plot is exclusive to integrative frameworks and represents the correlations between variables of different types, represented on the side quadrants. It seems that the qPCR variables are almost entirely negatively correlated with the other two dataframes. Just few from metabolomics and hormonomics are positively correlated. Note that these correlations are above a value of 0.7 (cutoff = 0.7). All interpretations made above are only relevant for features with very strong correlations.

Plot variables

#plotVar(res, cutoff=0.5, legend = TRUE, overlap=!FALSE, style='graphics')
#plotVar(res, cutoff=0.5, legend = TRUE, overlap=FALSE, style='graphics')
plotVar(final.diablo.model, cutoff=0.5, legend = TRUE, comp=c(1,2), overlap=FALSE, style='ggplot2', col=cols)

plotVar(final.diablo.model, cutoff=0.5, legend = TRUE, comp=c(2,3), overlap=FALSE, col=cols)

circosPlot(final.diablo.model, cutoff = 0.7, line = TRUE,
           color.blocks= cols,
           color.cor = c(3,2), size.labels = 1
           , xpd=TRUE)

3.6 Relevance networks

Another visualisation of the correlations between the different types of variables is the relevance network, which is also built on the similarity matrix (as is the circos plot). Colour represent variable type.

blocks <- combn(length(data),2)
j <- 1
cutoff <- 0.8
out35a <- ""
for(j in 1:ncol(blocks)){
    out35a <- paste( out35a, knit_child("035a-DIABLO-network.Rmd", quiet=!TRUE))

        if(interactive()) readline()
}
cat(out35a)

3.6.1 hormonomics and metabolomics

nfn <- paste0("network-035a-",paste(names(data)[blocks[,j]], collapse="-"),"-",cutoff*10)
#nfn <- paste0("network-035a-",j,"-",cutoff*10)
nfn
## [1] "network-035a-hormonomics-metabolomics-8"
write(nfn, "bla.log", append=TRUE)
png(paste0(nfn,".png"), res = 600, width = 4000, height = 4000)
nw <- network(final.diablo.model
        , blocks = blocks[,j]
        , color.node = cols[blocks[,j]]
        , cutoff = cutoff
        , shape.node = "rectangle"
        , save = "png"
       , name.save = nfn
        )
#title(main=paste(names(data)[blocks[,j]], sep=", "),
#sub=paste("Cutoff = ",cutoff))
#        
#dev.off()

Cutoff = 0.8

network-035a-hormonomics-metabolomics-8

3.6.2 hormonomics and qPCR

nfn <- paste0("network-035a-",paste(names(data)[blocks[,j]], collapse="-"),"-",cutoff*10)
#nfn <- paste0("network-035a-",j,"-",cutoff*10)
nfn
## [1] "network-035a-hormonomics-qPCR-8"
write(nfn, "bla.log", append=TRUE)
png(paste0(nfn,".png"), res = 600, width = 4000, height = 4000)
nw <- network(final.diablo.model
        , blocks = blocks[,j]
        , color.node = cols[blocks[,j]]
        , cutoff = cutoff
        , shape.node = "rectangle"
        , save = "png"
       , name.save = nfn
        )
#title(main=paste(names(data)[blocks[,j]], sep=", "),
#sub=paste("Cutoff = ",cutoff))
#        
#dev.off()

Cutoff = 0.8

network-035a-hormonomics-qPCR-8

3.6.3 hormonomics and proteomics

nfn <- paste0("network-035a-",paste(names(data)[blocks[,j]], collapse="-"),"-",cutoff*10)
#nfn <- paste0("network-035a-",j,"-",cutoff*10)
nfn
## [1] "network-035a-hormonomics-proteomics-8"
write(nfn, "bla.log", append=TRUE)
png(paste0(nfn,".png"), res = 600, width = 4000, height = 4000)
nw <- network(final.diablo.model
        , blocks = blocks[,j]
        , color.node = cols[blocks[,j]]
        , cutoff = cutoff
        , shape.node = "rectangle"
        , save = "png"
       , name.save = nfn
        )
#title(main=paste(names(data)[blocks[,j]], sep=", "),
#sub=paste("Cutoff = ",cutoff))
#        
#dev.off()

Cutoff = 0.8

network-035a-hormonomics-proteomics-8

3.6.4 metabolomics and qPCR

nfn <- paste0("network-035a-",paste(names(data)[blocks[,j]], collapse="-"),"-",cutoff*10)
#nfn <- paste0("network-035a-",j,"-",cutoff*10)
nfn
## [1] "network-035a-metabolomics-qPCR-8"
write(nfn, "bla.log", append=TRUE)
png(paste0(nfn,".png"), res = 600, width = 4000, height = 4000)
nw <- network(final.diablo.model
        , blocks = blocks[,j]
        , color.node = cols[blocks[,j]]
        , cutoff = cutoff
        , shape.node = "rectangle"
        , save = "png"
       , name.save = nfn
        )
#title(main=paste(names(data)[blocks[,j]], sep=", "),
#sub=paste("Cutoff = ",cutoff))
#        
#dev.off()

Cutoff = 0.8

network-035a-metabolomics-qPCR-8

3.6.5 metabolomics and proteomics

nfn <- paste0("network-035a-",paste(names(data)[blocks[,j]], collapse="-"),"-",cutoff*10)
#nfn <- paste0("network-035a-",j,"-",cutoff*10)
nfn
## [1] "network-035a-metabolomics-proteomics-8"
write(nfn, "bla.log", append=TRUE)
png(paste0(nfn,".png"), res = 600, width = 4000, height = 4000)
nw <- network(final.diablo.model
        , blocks = blocks[,j]
        , color.node = cols[blocks[,j]]
        , cutoff = cutoff
        , shape.node = "rectangle"
        , save = "png"
       , name.save = nfn
        )
#title(main=paste(names(data)[blocks[,j]], sep=", "),
#sub=paste("Cutoff = ",cutoff))
#        
#dev.off()

Cutoff = 0.8

network-035a-metabolomics-proteomics-8

3.6.6 qPCR and proteomics

nfn <- paste0("network-035a-",paste(names(data)[blocks[,j]], collapse="-"),"-",cutoff*10)
#nfn <- paste0("network-035a-",j,"-",cutoff*10)
nfn
## [1] "network-035a-qPCR-proteomics-8"
write(nfn, "bla.log", append=TRUE)
png(paste0(nfn,".png"), res = 600, width = 4000, height = 4000)
nw <- network(final.diablo.model
        , blocks = blocks[,j]
        , color.node = cols[blocks[,j]]
        , cutoff = cutoff
        , shape.node = "rectangle"
        , save = "png"
       , name.save = nfn
        )
#title(main=paste(names(data)[blocks[,j]], sep=", "),
#sub=paste("Cutoff = ",cutoff))
#        
#dev.off()

Cutoff = 0.8

network-035a-qPCR-proteomics-8

3.7 Multipartite network

cutoff <- 0.0
x <- final.diablo.model
layout.fun <- NULL
label <- paste(.treat, collapse=", ")
out35b <- ""
  out35b <- paste( out35b, knit_child("035b-multipartite-network.Rmd", quiet=TRUE))
cat(out35b)

3.7.1 Cutoff = 0

ndata <- length(data)
lbl <- gsub(", ","-",label)
nfn <- paste("network-035b",lbl,cutoff*10,sep="-")
#png(nfn, res = 600, width = 4000, height = 4000)
write(nfn, "bla.log", append=TRUE)
set.seed(1234)
nw <- my.network(x
        , blocks = 1:ndata
        , color.node = cols
        , cutoff = cutoff
        , shape.node = "rectangle"
        , layout = layout.fun
        , save = "png"
        , name.save = nfn
        )
#        title( #main=paste(names(data), sep=", "),
#        sub=paste("Cutoff = ",cutoff))
#        title(label,adj=0.8,outer=TRUE,line=-1)
#        legend("bottomright", pch=15,pt.cex=2,col=cols, legend=names(data),
#        bty="n")
#        text(ly[,1],ly[,2],names(V(nw$gR)))
#dev.off()

network-035b-C-H-0

# Save network layout in ly, used by my.layout function.
if(exists(deparse(substitute(nw)))) ly <- nw$layout else ly <- NULL
cutoff <- 0.8
x <- final.diablo.model
label <- paste(.treat, collapse=", ")
out35b <- ""
  out35b <- paste( out35b, knit_child("035b-multipartite-network.Rmd", quiet=TRUE))
cat(out35b)

3.7.2 Cutoff = 0.8

ndata <- length(data)
lbl <- gsub(", ","-",label)
nfn <- paste("network-035b",lbl,cutoff*10,sep="-")
#png(nfn, res = 600, width = 4000, height = 4000)
write(nfn, "bla.log", append=TRUE)
set.seed(1234)
nw <- my.network(x
        , blocks = 1:ndata
        , color.node = cols
        , cutoff = cutoff
        , shape.node = "rectangle"
        , layout = layout.fun
        , save = "png"
        , name.save = nfn
        )
#        title( #main=paste(names(data), sep=", "),
#        sub=paste("Cutoff = ",cutoff))
#        title(label,adj=0.8,outer=TRUE,line=-1)
#        legend("bottomright", pch=15,pt.cex=2,col=cols, legend=names(data),
#        bty="n")
#        text(ly[,1],ly[,2],names(V(nw$gR)))
#dev.off()

network-035b-C-H-8

3.8 Visualize loadings

The function “plotLoadings” visualises the loading weights of each selected variable on each component and each data set. The colour indicates the class in which the variable has the maximum level of expression “contrib = ‘max’” using the median “method = ‘median’”. Figure below depicts the loading values for each dimension.

for(i in 1:ncomp)
plotLoadings(final.diablo.model, comp = i, contrib = 'max', method = 'median')

3.9 Clustered image map ( Heatmap )

The cimDIABLO() function is a clustered image map specifically implemented to represent the multi-omics molecular signature expression for each sample. From figure below the areas of homogeneous expression levels for a set of samples across a set of features can be determined. For instance, the H14 samples were the only group to show extremely high levels of expression for a specific set of genes and metabolites. This indicates these features are fairly discriminating for this subtype.

cimfn <- "cim.png"
png(cimfn, res = 600, width = 4000, height = 4000)
cimDiablo(final.diablo.model, size.legend=0.7)
dev.off()
## pdf 
##   2

cim.png

3.10 AUC and ROC plots

An AUC plot per block can also be obtained using the function auroc(). The interpretation of this output may not be particularly insightful in relation to the performance evaluation of our methods, but can complement the statistical analysis.

par(mfrow=c(2,2))
for(i in 1:length(data))
auc.splsda = auroc(final.diablo.model, roc.block = names(data[i]),
                   roc.comp = 1, print = FALSE)

Save finil DIABLO model for future use in networks.

res <- final.diablo.model

Estimate classification error rate. The error rate should drop by more components used.

# run component number tuning with repeated CV
system.time(perf.diablo  <-  perf(res, validation = 'Mfold',
                   folds = 3, nrepeat = 10))
##    user  system elapsed 
##   14.44    0.36   14.95
plot(perf.diablo) # plot output of tuning

3.11 Names of kept variables

# the features selected to form components
for (comp in 1:ncomp){
cat("\nComponent ", comp,":\n")
for(i in 1:length(data)){
cat(names(data)[i],"\n")
print(selectVar(res, comp = comp)[[i]]$name)
}
}
## 
## Component  1 :
## hormonomics 
##  [1] "DPA"        "SA"         "PA"         "X12.OH.JA"  "JA.Ile"    
##  [6] "X9.10.dhJA" "ABA"        "IAA"        "IAA.Asp"    "oxIAA"     
## metabolomics 
##  [1] "Glukose"  "Tyr"      "Fructose" "Ile"      "Val"      "His"     
##  [7] "Lys"      "Gln"      "Pro"      "Leu"     
## qPCR 
##  [1] "X13.LOX" "CAT1"    "M0ZJG3"  "PR1b"    "HSP70"   "SWEET"  
##  [7] "SnRK2"   "RbohA"   "SP6A"    "CO"     
## proteomics 
##  [1] "PBdnRY1_2239"   "VdnDe1_39992"   "PBdnRY1_5389"  
##  [4] "CLCdnPW49_8560" "CLCdnDe6_4953"  "SdnPW1_46"     
##  [7] "VdnPW3_37011"   "PBdnRY1_7025"   "VdnDe2_29290"  
## [10] "VdnDe4_115100" 
## 
## Component  2 :
## hormonomics 
##  [1] "ABA"        "oxIAA"      "IAA"        "cisOPDA"    "PA"        
##  [6] "SA"         "JA.Ile"     "IAA.Asp"    "JA"         "X9.10.dhJA"
## metabolomics 
##  [1] "Starch"  "Ser"     "Met"     "Asn"     "Arg"     "Sucrose"
##  [7] "Gly"     "His"     "Pro"     "Ala"    
## qPCR 
##  [1] "SP6A"   "PR1b"   "SnRK2"  "RD29B"  "CO"     "P5CS"   "M0ZJG3"
##  [8] "ERF1"   "HSP70"  "RbohA" 
## proteomics 
##  [1] "CLCdnPW31_956"          "CLCdnPW48_15164"       
##  [3] "CLCdnPW22_15452"        "CLCdnRY10_6741"        
##  [5] "VdnPW4_18801"           "CLCdnDe7_6144"         
##  [7] "TDe_comp110944_c2_seq4" "CLCdnDe2_311"          
##  [9] "VdnDe1_162036"          "VdnDe2_29290"          
## 
## Component  3 :
## hormonomics 
##  [1] "JA.Ile"     "JA"         "IAA.Asp"    "oxIAA"      "X9.10.dhJA"
##  [6] "DPA"        "PA"         "ABA"        "X12.OH.JA"  "SA"        
## metabolomics 
##  [1] "Met" "Gln" "Leu" "Ala" "Phe" "Arg" "Gly" "Glu" "Asn" "Ile"
## qPCR 
##  [1] "ACO2"    "RD29B"   "PR1b"    "SnRK2"   "M0ZJG3"  "X13.LOX"
##  [7] "CO"      "ERF1"    "HSP70"   "SP6A"   
## proteomics 
##  [1] "CLCdnPW12_1472"     "VdnDe2_53978"       "Sotub11g025210.1.1"
##  [4] "SdnRY1_10712"       "CLCdnDe7_6144"      "CLCdnDe2_311"      
##  [7] "CLCdnPW22_15452"    "CLCdnPW48_15164"    "PBdnRY1_5389"      
## [10] "CLCdnPW49_8560"    
## 
## Component  4 :
## hormonomics 
##  [1] "IAA"       "X12.OH.JA" "SA"        "JA"        "cisOPDA"  
##  [6] "oxIAA"     "ABA"       "PA"        "IAA.Asp"   "JA.Ile"   
## metabolomics 
##  [1] "Asp"      "Ala"      "Pro"      "Leu"      "Sucrose"  "Glu"     
##  [7] "Asn"      "Fructose" "Gln"      "Tyr"     
## qPCR 
##  [1] "RbohA"   "RD29B"   "P5CS"    "M0ZJG3"  "ACO2"    "SP6A"   
##  [7] "CO"      "HSP70"   "X13.LOX" "SWEET"  
## proteomics 
##  [1] "VdnDe4_115100"          "VdnDe2_53978"          
##  [3] "CLCdnPW10_11437"        "VdnPW5_1541"           
##  [5] "VdnPW4_18801"           "TDe_comp110944_c2_seq4"
##  [7] "CLCdnDe13_63179"        "PBdnRY1_5389"          
##  [9] "CLCdnDe10_24187"        "CLCdnPW12_1472"

One would like to reduce the number of nodes, especially for proteomics data. One option is to reduce datasets in a way to keep only the variables in the selectVars in original data in . We will keep variables from the first two components.

keptVars <- unique(c(
 selectVar(res, comp=1)[[1]]$name
,selectVar(res, comp=2)[[1]]$name
)
)
which(keptVars%in%selectVar(res, comp=1)[[1]]$name)
##  [1]  1  2  3  4  5  6  7  8  9 10
which(keptVars%in%selectVar(res, comp=2)[[1]]$name)
##  [1]  2  3  5  6  7  8  9 10 11 12

3.12 Visualise variables

Loadings

sapply(res$loadings, head, 30)
## $hormonomics
##                  comp1        comp2       comp3       comp4
## IAA         0.13144028 -0.427084773  0.00000000  0.69741594
## oxIAA       0.02140750  0.523958722 -0.33911760  0.11122971
## IAA.Asp    -0.02843189 -0.072761300 -0.41420716  0.03371556
## ABA         0.18150873  0.547287697  0.12216297  0.10508803
## PA          0.37411904  0.225079925  0.15895800 -0.10029499
## DPA         0.54194363  0.000000000  0.16932019  0.00000000
## SA          0.43792876 -0.196807579  0.02231204 -0.37591784
## JA          0.00000000 -0.071436939  0.51427598  0.35624894
## JA.Ile      0.32214610 -0.112915808 -0.56765511  0.01220992
## X9.10.dhJA  0.31327382  0.008634329 -0.22625524  0.00000000
## X12.OH.JA   0.34805635  0.000000000  0.07854647  0.44190515
## cisOPDA     0.00000000  0.361836786  0.00000000  0.12386635
## 
## $metabolomics
##                 comp1       comp2        comp3        comp4
## Glukose   0.421348033  0.00000000  0.000000000  0.000000000
## Fructose  0.396648792  0.00000000  0.000000000 -0.059230034
## Sucrose   0.000000000  0.17836618  0.000000000  0.288202131
## Starch    0.000000000  0.58212188  0.000000000  0.000000000
## Asp       0.000000000  0.00000000  0.000000000  0.615461006
## Glu       0.000000000  0.00000000 -0.062051591  0.214702816
## Asn       0.000000000 -0.33431430 -0.024008115  0.198455665
## Ser       0.000000000  0.44236521  0.000000000  0.000000000
## Gln      -0.156450966  0.00000000  0.537274133 -0.020087937
## Gly       0.000000000  0.17277876  0.073284686  0.000000000
## His       0.313179595 -0.16364832  0.000000000  0.000000000
## Arg       0.000000000 -0.21865289 -0.096045583  0.000000000
## Thr       0.000000000  0.00000000  0.000000000  0.000000000
## Ala       0.000000000 -0.04719821  0.320377830 -0.434979949
## Pro       0.046630023  0.15483596  0.000000000  0.379588688
## Tyr       0.396813851  0.00000000  0.000000000  0.007278025
## Val       0.377558612  0.00000000  0.000000000  0.000000000
## Met       0.000000000 -0.43728982  0.621303077  0.000000000
## Ile       0.390337716  0.00000000  0.007304342  0.000000000
## Lys       0.296601576  0.00000000  0.000000000  0.000000000
## Leu       0.007421809  0.00000000 -0.428797067 -0.339713868
## Phe       0.000000000  0.00000000  0.140479789  0.000000000
## 
## $qPCR
##              comp1       comp2       comp3       comp4
## RbohA   0.11488495 -0.03763636  0.00000000 -0.69688768
## SnRK2   0.21909973 -0.19550340 -0.20182339  0.00000000
## ACO2    0.00000000  0.00000000 -0.84831678  0.16503002
## HSP70   0.31290916 -0.05802711 -0.02658670  0.09694596
## PR1b    0.31727130  0.43968061 -0.25582455  0.00000000
## RD29B   0.00000000  0.18514239  0.36562926  0.52383947
## X13.LOX 0.61196635  0.00000000  0.11119676 -0.08952333
## P5CS    0.00000000 -0.13307545  0.00000000 -0.33445732
## ERF1    0.00000000  0.07315419 -0.06058389  0.00000000
## CAT1    0.40209073  0.00000000  0.00000000  0.00000000
## CO      0.04040593 -0.15541798  0.09155034 -0.10056215
## SWEET   0.23861984  0.00000000  0.00000000  0.05562162
## SP6A    0.06927924  0.82049013  0.01197573 -0.15594627
## M0ZJG3  0.37506277 -0.09475562  0.12340235  0.21425330
## 
## $proteomics
##                             comp1      comp2      comp3       comp4
## VdnDe2_86784            0.0000000  0.0000000  0.0000000  0.00000000
## PBdnRY1_7025           -0.1311749  0.0000000  0.0000000  0.00000000
## VdnPW5_1541             0.0000000  0.0000000  0.0000000 -0.41510860
## CLCdnPW10_11437         0.0000000  0.0000000  0.0000000  0.42226947
## CLCdnDe2_96512          0.0000000  0.0000000  0.0000000  0.00000000
## VdnPW1_81435            0.0000000  0.0000000  0.0000000  0.00000000
## CLCdnDe6_4953          -0.1668200  0.0000000  0.0000000  0.00000000
## CLCdnDe13_5738          0.0000000  0.0000000  0.0000000  0.00000000
## CLCdnPW12_1472          0.0000000  0.0000000 -0.7727688 -0.08761565
## CLCdnPW49_8560          0.3775259  0.0000000  0.0234417  0.00000000
## VdnPW4_18801            0.0000000 -0.2797318  0.0000000  0.27790852
## SdnRY1_10712            0.0000000  0.0000000  0.2309571  0.00000000
## CLCdnDe2_311            0.0000000 -0.2286149  0.1434317  0.00000000
## CLCdnDe13_63179         0.0000000  0.0000000  0.0000000  0.11111066
## CLCdnDe10_24187         0.0000000  0.0000000  0.0000000  0.10297846
## TDe_comp110944_c2_seq4  0.0000000  0.2358210  0.0000000  0.19222059
## PBdnRY1_11085           0.0000000  0.0000000  0.0000000  0.00000000
## CLCdnPW48_15164         0.0000000 -0.3311016 -0.1176134  0.00000000
## CLCdnPW31_956           0.0000000  0.6172570  0.0000000  0.00000000
## VdnDe1_162036           0.0000000  0.1630251  0.0000000  0.00000000
## PBdnRY1_2239            0.5229278  0.0000000  0.0000000  0.00000000
## CLCdnPW22_15452         0.0000000  0.3250012 -0.1262230  0.00000000
## CLCdnPW50_2470          0.0000000  0.0000000  0.0000000  0.00000000
## VdnPW3_37011            0.1385400  0.0000000  0.0000000  0.00000000
## SdnPW1_46               0.1625022  0.0000000  0.0000000  0.00000000
## CLCdnDe7_34340          0.0000000  0.0000000  0.0000000  0.00000000
## CLCdnDe11_2286          0.0000000  0.0000000  0.0000000  0.00000000
## VdnDe2_53978            0.0000000  0.0000000 -0.3770223  0.48659753
## CLCdnDe7_6144           0.0000000  0.2743315 -0.1455825  0.00000000
## CLCdnRY10_6741          0.0000000  0.3212965  0.0000000  0.00000000
## 
## $Y
##           comp1       comp2        comp3       comp4
## C01 -0.33122257 -0.03974545  0.179453291 -0.41808135
## C07 -0.13945036 -0.03414944  0.270472474  0.50710418
## C08 -0.09539657  0.25752416  0.007316191 -0.28629300
## C14  0.01057355  0.79559987 -0.215039350  0.22566692
## H01 -0.29361360 -0.17847974  0.249614219 -0.31541180
## H07 -0.03445777 -0.39389515  0.145580486  0.52951240
## H08  0.00366778 -0.32213406 -0.848654836 -0.00746462
## H14  0.87989952 -0.08472017  0.211257525 -0.23503274
#plotLoadings(res, comp = 1, method = 'median')
#plotLoadings(res, comp = 1, method = 'median', contrib="max")
for( i in 1:ncomp)
plotLoadings(res, comp = i, method = 'median', contrib="max")

3.13 Plot variables

#plotVar(res, cutoff=0.5, legend = TRUE, overlap=!FALSE, style='graphics')
#plotVar(res, cutoff=0.5, legend = TRUE, overlap=FALSE, style='graphics')
plotVar(res, cutoff=0.5, legend = TRUE, comp=c(1,2), overlap=FALSE, style='ggplot2', col=cols)

plotVar(res, cutoff=0.5, legend = TRUE, comp=c(2,3), overlap=FALSE, col=cols)

4 Differential networks

Here we will show differential networks between treatments.

cutoffs <- c(0.7)
pairs <- combn(1:length(names(res$X)),2)
outn <- ""
j <- 4
cutoff <- 0.5
for(j in 1:ncol(pairs) ){
    pair <- pairs[,j]
    X <- data[[pair[1]]]
    Y <- data[[pair[2]]]
    datasets <- names(data)[pair]
    outn <- paste( outn, knit_child("023-prepare-networkdiff.Rmd", quiet=TRUE))
   for(cutoff in cutoffs){
   outn <- paste( outn, knit_child("035-Network.Rmd", quiet=TRUE))
   }
}
cat(outn)
size.variables <- 1
sim <-circosPlot(final.diablo.model, cutoff = 0.5, line = TRUE,
           color.blocks= cols,
           color.cor = c(3,2), size.labels = 1
           , size.variables = size.variables
           , xpd=TRUE)

circosPlot(final.diablo.model, cutoff = 0.75, line = TRUE,
           color.blocks= cols,
           color.cor = c(3,2), size.labels = 1
           , size.variables = size.variables
           , xpd=TRUE)

circosPlot(final.diablo.model, cutoff = 0.9, line = TRUE,
           color.blocks= cols,
           color.cor = c(3,2), size.labels = 1
           , size.variables = size.variables
           , xpd=TRUE)

5 Networks

We will prepare partial models for each treatment and treatment combination.

5.1 Network for C.

filter <- pdata$Treatment %in% .treat[1]
XX1 <- lapply(CCDATA, function(x) if(is.null(dim(x))) x[filter] else x[filter,])
table(XX1$status)
## 
## C01 C07 C08 C14 
##   4   4   4   4
res1 <- block.splsda(X = XX1[-1]
    , Y = as.factor(XX1[[1]])
    , ncomp = ncomp
    , keepX = list.keepX
    , design = design
    )
## Design matrix has changed to include Y; each block will be
##             linked to Y.
cutoff <- 0.0
x <- res1
layout.fun <- NULL
label <-.treat[1]
out23b <- ""
  out23b <- paste( out23b, knit_child("035b-multipartite-network.Rmd", quiet=TRUE))
N1 <- nw
cat(out23b)

5.1.1 Cutoff = 0

ndata <- length(data)
lbl <- gsub(", ","-",label)
nfn <- paste("network-035b",lbl,cutoff*10,sep="-")
#png(nfn, res = 600, width = 4000, height = 4000)
write(nfn, "bla.log", append=TRUE)
set.seed(1234)
nw <- my.network(x
        , blocks = 1:ndata
        , color.node = cols
        , cutoff = cutoff
        , shape.node = "rectangle"
        , layout = layout.fun
        , save = "png"
        , name.save = nfn
        )
#        title( #main=paste(names(data), sep=", "),
#        sub=paste("Cutoff = ",cutoff))
#        title(label,adj=0.8,outer=TRUE,line=-1)
#        legend("bottomright", pch=15,pt.cex=2,col=cols, legend=names(data),
#        bty="n")
#        text(ly[,1],ly[,2],names(V(nw$gR)))
#dev.off()

network-035b-C-0

Save network layout for further plots, used by layout function my.layout.

ly <- nw$layout
cutoff <- 0.7
x <- res1
layout.fun <- my.layout
label <- .treat[1]
out23b <- ""
  out23b <- paste( out23b, knit_child("035b-multipartite-network.Rmd", quiet=TRUE))
cat(out23b)

5.1.2 Cutoff = 0.7

ndata <- length(data)
lbl <- gsub(", ","-",label)
nfn <- paste("network-035b",lbl,cutoff*10,sep="-")
#png(nfn, res = 600, width = 4000, height = 4000)
write(nfn, "bla.log", append=TRUE)
set.seed(1234)
nw <- my.network(x
        , blocks = 1:ndata
        , color.node = cols
        , cutoff = cutoff
        , shape.node = "rectangle"
        , layout = layout.fun
        , save = "png"
        , name.save = nfn
        )
#        title( #main=paste(names(data), sep=", "),
#        sub=paste("Cutoff = ",cutoff))
#        title(label,adj=0.8,outer=TRUE,line=-1)
#        legend("bottomright", pch=15,pt.cex=2,col=cols, legend=names(data),
#        bty="n")
#        text(ly[,1],ly[,2],names(V(nw$gR)))
#dev.off()

network-035b-C-7

5.2 Network for H.

filter <- pdata$Treatment %in% .treat[2]
XX2 <- lapply(CCDATA, function(x) if(is.null(dim(x))) x[filter] else x[filter,])
table(XX2$status)
## 
## H01 H07 H08 H14 
##   4   4   4   4
res2 <- block.splsda(X = XX2[-1]
    , Y = as.factor(XX2[[1]])
    , ncomp = ncomp
    , keepX = list.keepX
    , design = design
    )
## Design matrix has changed to include Y; each block will be
##             linked to Y.
cutoff <- 0.0
x <- res2
layout.fun <- NULL
label <- .treat[2]
out23b <- ""
  out23b <- paste( out23b, knit_child("035b-multipartite-network.Rmd", quiet=TRUE))
N2 <- nw
cat(out23b)

5.2.1 Cutoff = 0

ndata <- length(data)
lbl <- gsub(", ","-",label)
nfn <- paste("network-035b",lbl,cutoff*10,sep="-")
#png(nfn, res = 600, width = 4000, height = 4000)
write(nfn, "bla.log", append=TRUE)
set.seed(1234)
nw <- my.network(x
        , blocks = 1:ndata
        , color.node = cols
        , cutoff = cutoff
        , shape.node = "rectangle"
        , layout = layout.fun
        , save = "png"
        , name.save = nfn
        )
#        title( #main=paste(names(data), sep=", "),
#        sub=paste("Cutoff = ",cutoff))
#        title(label,adj=0.8,outer=TRUE,line=-1)
#        legend("bottomright", pch=15,pt.cex=2,col=cols, legend=names(data),
#        bty="n")
#        text(ly[,1],ly[,2],names(V(nw$gR)))
#dev.off()

network-035b-H-0

Save layout for further plots, used by layout function my.layout.

ly <- nw$layout
cutoff <- 0.7
x <- res2
layout.fun <- my.layout
label <- .treat[2]
out23b <- ""
  out23b <- paste( out23b, knit_child("035b-multipartite-network.Rmd", quiet=TRUE))
cat(out23b)

5.2.2 Cutoff = 0.7

ndata <- length(data)
lbl <- gsub(", ","-",label)
nfn <- paste("network-035b",lbl,cutoff*10,sep="-")
#png(nfn, res = 600, width = 4000, height = 4000)
write(nfn, "bla.log", append=TRUE)
set.seed(1234)
nw <- my.network(x
        , blocks = 1:ndata
        , color.node = cols
        , cutoff = cutoff
        , shape.node = "rectangle"
        , layout = layout.fun
        , save = "png"
        , name.save = nfn
        )
#        title( #main=paste(names(data), sep=", "),
#        sub=paste("Cutoff = ",cutoff))
#        title(label,adj=0.8,outer=TRUE,line=-1)
#        legend("bottomright", pch=15,pt.cex=2,col=cols, legend=names(data),
#        bty="n")
#        text(ly[,1],ly[,2],names(V(nw$gR)))
#dev.off()

network-035b-H-7

Save network file for combined and single treatments. Networks are in objects res, res1 and res2.

write("Mid diablo 5 41 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!", "bla.log", append=TRUE)

5.3 Network for C and H.

# Complete network, cutoff = 0, both
datasets <- names(CCDATA[-1])
ndatasets<- length(datasets)
#
N12 <- network(res
    , cutoff = 0
    , blocks = 1:ndatasets
    , shape.node = c("rectangle")
    , save = "png"
    , name.save="network-CH"
    )
#
e <- extractEdges2(N12)
colnames(e)[ncol(e)] <- paste(.treat, collapse=".")
head(e)
##                                        edge      group1    from
## ho.IAA_me.Glukose         ho.IAA_me.Glukose hormonomics     IAA
## ho.oxIAA_me.Glukose     ho.oxIAA_me.Glukose hormonomics   oxIAA
## ho.IAA.Asp_me.Glukose ho.IAA.Asp_me.Glukose hormonomics IAA.Asp
## ho.ABA_me.Glukose         ho.ABA_me.Glukose hormonomics     ABA
## ho.PA_me.Glukose           ho.PA_me.Glukose hormonomics      PA
## ho.DPA_me.Glukose         ho.DPA_me.Glukose hormonomics     DPA
##                             group2      to         C.H
## ho.IAA_me.Glukose     metabolomics Glukose -0.07301277
## ho.oxIAA_me.Glukose   metabolomics Glukose  0.56720940
## ho.IAA.Asp_me.Glukose metabolomics Glukose -0.16468313
## ho.ABA_me.Glukose     metabolomics Glukose  0.79199700
## ho.PA_me.Glukose      metabolomics Glukose  0.82549058
## ho.DPA_me.Glukose     metabolomics Glukose  0.70122855
tail(e)
##                                                edge group1   from
## qP.ERF1_pr.VdnDe1_39992     qP.ERF1_pr.VdnDe1_39992   qPCR   ERF1
## qP.CAT1_pr.VdnDe1_39992     qP.CAT1_pr.VdnDe1_39992   qPCR   CAT1
## qP.CO_pr.VdnDe1_39992         qP.CO_pr.VdnDe1_39992   qPCR     CO
## qP.SWEET_pr.VdnDe1_39992   qP.SWEET_pr.VdnDe1_39992   qPCR  SWEET
## qP.SP6A_pr.VdnDe1_39992     qP.SP6A_pr.VdnDe1_39992   qPCR   SP6A
## qP.M0ZJG3_pr.VdnDe1_39992 qP.M0ZJG3_pr.VdnDe1_39992   qPCR M0ZJG3
##                               group2           to        C.H
## qP.ERF1_pr.VdnDe1_39992   proteomics VdnDe1_39992 -0.6966147
## qP.CAT1_pr.VdnDe1_39992   proteomics VdnDe1_39992 -0.7764547
## qP.CO_pr.VdnDe1_39992     proteomics VdnDe1_39992 -0.7394983
## qP.SWEET_pr.VdnDe1_39992  proteomics VdnDe1_39992 -0.7475032
## qP.SP6A_pr.VdnDe1_39992   proteomics VdnDe1_39992 -0.5797108
## qP.M0ZJG3_pr.VdnDe1_39992 proteomics VdnDe1_39992 -0.7440226
dim(e)
## [1] 1983    6
# treatment 1
e1 <- extractEdges2(N1)
colnames(e1)[ncol(e1)] <- .treat[1]
head(e1)
##                                        edge      group1    from
## ho.IAA_me.Glukose         ho.IAA_me.Glukose hormonomics     IAA
## ho.oxIAA_me.Glukose     ho.oxIAA_me.Glukose hormonomics   oxIAA
## ho.IAA.Asp_me.Glukose ho.IAA.Asp_me.Glukose hormonomics IAA.Asp
## ho.ABA_me.Glukose         ho.ABA_me.Glukose hormonomics     ABA
## ho.PA_me.Glukose           ho.PA_me.Glukose hormonomics      PA
## ho.DPA_me.Glukose         ho.DPA_me.Glukose hormonomics     DPA
##                             group2      to           C
## ho.IAA_me.Glukose     metabolomics Glukose -0.03981674
## ho.oxIAA_me.Glukose   metabolomics Glukose  0.39181219
## ho.IAA.Asp_me.Glukose metabolomics Glukose -0.40212075
## ho.ABA_me.Glukose     metabolomics Glukose  0.71808293
## ho.PA_me.Glukose      metabolomics Glukose  0.89196374
## ho.DPA_me.Glukose     metabolomics Glukose  0.79288478
dim(e1)
## [1] 1988    6
e <- merge(e,e1, sort=FALSE, all=TRUE)
head(e)
##                    edge      group1    from       group2      to
## 1     ho.IAA_me.Glukose hormonomics     IAA metabolomics Glukose
## 2   ho.oxIAA_me.Glukose hormonomics   oxIAA metabolomics Glukose
## 3 ho.IAA.Asp_me.Glukose hormonomics IAA.Asp metabolomics Glukose
## 4     ho.ABA_me.Glukose hormonomics     ABA metabolomics Glukose
## 5      ho.PA_me.Glukose hormonomics      PA metabolomics Glukose
## 6     ho.DPA_me.Glukose hormonomics     DPA metabolomics Glukose
##           C.H           C
## 1 -0.07301277 -0.03981674
## 2  0.56720940  0.39181219
## 3 -0.16468313 -0.40212075
## 4  0.79199700  0.71808293
## 5  0.82549058  0.89196374
## 6  0.70122855  0.79288478
tail(e)
##                          edge       group1 from     group2
## 2124 me.Thr_pr.CLCdnRY10_6741 metabolomics  Thr proteomics
## 2125           me.Thr_qP.ACO2 metabolomics  Thr       qPCR
## 2126          me.Thr_qP.SWEET metabolomics  Thr       qPCR
## 2127  qP.CAT1_pr.VdnDe2_86784         qPCR CAT1 proteomics
## 2128   me.Thr_pr.PBdnRY1_7025 metabolomics  Thr proteomics
## 2129  ho.SA_pr.CLCdnDe11_2286  hormonomics   SA proteomics
##                  to C.H          C
## 2124 CLCdnRY10_6741  NA -0.2004261
## 2125           ACO2  NA -0.3198836
## 2126          SWEET  NA -0.5781006
## 2127   VdnDe2_86784  NA  0.6600324
## 2128   PBdnRY1_7025  NA -0.1432746
## 2129 CLCdnDe11_2286  NA  0.2515391
# treatment 2
.treat[2]
## [1] "H"
e2 <- extractEdges2(N2)
colnames(e2)[ncol(e2)] <- .treat[2]
head(e2)
##                                        edge      group1    from
## ho.IAA_me.Glukose         ho.IAA_me.Glukose hormonomics     IAA
## ho.oxIAA_me.Glukose     ho.oxIAA_me.Glukose hormonomics   oxIAA
## ho.IAA.Asp_me.Glukose ho.IAA.Asp_me.Glukose hormonomics IAA.Asp
## ho.ABA_me.Glukose         ho.ABA_me.Glukose hormonomics     ABA
## ho.PA_me.Glukose           ho.PA_me.Glukose hormonomics      PA
## ho.DPA_me.Glukose         ho.DPA_me.Glukose hormonomics     DPA
##                             group2      to          H
## ho.IAA_me.Glukose     metabolomics Glukose -0.1079095
## ho.oxIAA_me.Glukose   metabolomics Glukose  0.2878757
## ho.IAA.Asp_me.Glukose metabolomics Glukose -0.3166965
## ho.ABA_me.Glukose     metabolomics Glukose  0.8981663
## ho.PA_me.Glukose      metabolomics Glukose  0.8529442
## ho.DPA_me.Glukose     metabolomics Glukose  0.9111266
dim(e2)
## [1] 1930    6
e <- merge(e,e2, sort=FALSE, all=TRUE)
head(e)
##                    edge      group1    from       group2      to
## 1     ho.IAA_me.Glukose hormonomics     IAA metabolomics Glukose
## 2   ho.oxIAA_me.Glukose hormonomics   oxIAA metabolomics Glukose
## 3 ho.IAA.Asp_me.Glukose hormonomics IAA.Asp metabolomics Glukose
## 4     ho.ABA_me.Glukose hormonomics     ABA metabolomics Glukose
## 5      ho.PA_me.Glukose hormonomics      PA metabolomics Glukose
## 6     ho.DPA_me.Glukose hormonomics     DPA metabolomics Glukose
##           C.H           C          H
## 1 -0.07301277 -0.03981674 -0.1079095
## 2  0.56720940  0.39181219  0.2878757
## 3 -0.16468313 -0.40212075 -0.3166965
## 4  0.79199700  0.71808293  0.8981663
## 5  0.82549058  0.89196374  0.8529442
## 6  0.70122855  0.79288478  0.9111266
tail(e)
##                             edge       group1   from     group2
## 2172     ho.JA_pr.CLCdnDe13_5738  hormonomics     JA proteomics
## 2173     ho.PA_pr.CLCdnDe13_5738  hormonomics     PA proteomics
## 2174 qP.M0ZJG3_pr.CLCdnDe13_5738         qPCR M0ZJG3 proteomics
## 2175     ho.SA_pr.CLCdnDe13_5738  hormonomics     SA proteomics
## 2176      me.Thr_pr.VdnDe1_39992 metabolomics    Thr proteomics
## 2177    ho.DPA_pr.CLCdnDe13_5738  hormonomics    DPA proteomics
##                  to C.H  C           H
## 2172 CLCdnDe13_5738  NA NA -0.05819037
## 2173 CLCdnDe13_5738  NA NA  0.10518655
## 2174 CLCdnDe13_5738  NA NA -0.03546139
## 2175 CLCdnDe13_5738  NA NA  0.20878901
## 2176   VdnDe1_39992  NA NA -0.37066988
## 2177 CLCdnDe13_5738  NA NA -0.06906009
#
write("Mid diablo 5 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!", "bla.log", append=TRUE)

Compose file name and necessary information for network export file

file <- paste0("network-",paste(.treat, collapse="_"),"-",paste(datasets, collapse="_"),".txt")
label0 <- paste(paste(.treat, collapse=", "),"|",paste(datasets, collapse=", "),"; cutoff =",0)
title <- label0
sets <- 1:length(DATA)
suffix <- paste0(substr(names(DATA),1,2)[sets[-1]],collapse="-")
write("Mid diablo 6 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!", "bla.log", append=TRUE)
write(file.path(suffix,file), "bla.log", append=TRUE)
write(file, "bla.log", append=TRUE)
length(str(e))
## 'data.frame':    2177 obs. of  8 variables:
##  $ edge  : chr  "ho.IAA_me.Glukose" "ho.oxIAA_me.Glukose" "ho.IAA.Asp_me.Glukose" "ho.ABA_me.Glukose" ...
##  $ group1: chr  "hormonomics" "hormonomics" "hormonomics" "hormonomics" ...
##  $ from  : chr  "IAA" "oxIAA" "IAA.Asp" "ABA" ...
##  $ group2: chr  "metabolomics" "metabolomics" "metabolomics" "metabolomics" ...
##  $ to    : chr  "Glukose" "Glukose" "Glukose" "Glukose" ...
##  $ C.H   : num  -0.073 0.567 -0.165 0.792 0.825 ...
##  $ C     : num  -0.0398 0.3918 -0.4021 0.7181 0.892 ...
##  $ H     : num  -0.108 0.288 -0.317 0.898 0.853 ...
## [1] 0

5.4 Export edges table

#e <- data.frame(x=1:10,y=1:10)
#my.write.table(e, file="network.txt",meta=FALSE)
write.table(e, file = file, na="0")

Table with edges for networks based on combined treatments (C, H) and single treatments (C) and (H) is exported as a text file. This table can be used for inspection and filtering out edges based on selected cutoff. Missing edges are labeled as weight 0. This enables numeric filtration in other visualization or analysis files (e.g. Excel).

write("End diablo 7 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!", "bla.log", append=TRUE)
write("From 035-DIABLO !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!", "bla.log", append=TRUE)

6 SessionInfo

Windows 10 x64 (build 19045)
R version 4.0.2 (2020-06-22) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale: [1] LC_COLLATE=Slovenian_Slovenia.1250 [2] LC_CTYPE=Slovenian_Slovenia.1250
[3] LC_MONETARY=Slovenian_Slovenia.1250 [4] LC_NUMERIC=C
[5] LC_TIME=Slovenian_Slovenia.1250
system code page: 1252

attached base packages: [1] grid stats graphics utils datasets grDevices [7] methods base

other attached packages: [1] pheatmap_1.0.12 ComplexHeatmap_2.6.2 igraph_1.2.6
[4] mixOmics_6.14.0 ggplot2_3.3.5 lattice_0.22-5
[7] MASS_7.3-60.0.1 knitr_1.43 rmarkdown_2.21

loaded via a namespace (and not attached): [1] ggrepel_0.9.0 Rcpp_1.0.7 circlize_0.4.15
[4] tidyr_1.1.2 corpcor_1.6.9 png_0.1-7
[7] digest_0.6.35 RSpectra_0.16-0 R6_2.5.1
[10] plyr_1.8.6 stats4_4.0.2 ellipse_0.4.2
[13] evaluate_0.21 highr_0.8 pillar_1.4.7
[16] GlobalOptions_0.1.2 rlang_1.1.1 jquerylib_0.1.4
[19] S4Vectors_0.28.1 GetoptLong_1.0.5 Matrix_1.6-5
[22] labeling_0.4.2 rARPACK_0.11-0 stringr_1.4.0
[25] munsell_0.5.0 compiler_4.0.2 xfun_0.39
[28] pkgconfig_2.0.3 BiocGenerics_0.36.0 shape_1.4.6
[31] htmltools_0.5.2 tidyselect_1.1.0 tibble_3.0.4
[34] gridExtra_2.3 IRanges_2.24.1 matrixStats_1.2.0
[37] crayon_1.3.4 dplyr_1.0.2 withr_3.0.0
[40] jsonlite_1.8.8 gtable_0.3.0 lifecycle_0.2.0
[43] magrittr_2.0.1 scales_1.1.1 stringi_1.5.3
[46] farver_2.0.3 reshape2_1.4.4 bslib_0.3.1
[49] ellipsis_0.3.1 generics_0.1.0 vctrs_0.4.1
[52] rjson_0.2.20 RColorBrewer_1.1-2 tools_4.0.2
[55] Cairo_1.5-15 glue_1.4.2 purrr_0.3.4
[58] parallel_4.0.2 fastmap_1.1.1 yaml_2.2.1
[61] clue_0.3-60 colorspace_1.4-1 cluster_2.1.0
[64] sass_0.4.0